(54) Cellulose synthase gene



(57) mRNA was extracted from cotton plant fiber cells at the stage at which they accumulate cellulose, and cDNAs
complementary thereto were synthesized to construct a cDNA library. A total of 750 clones were arbitrarily
selected from the library and subjected to sequencing. From the nucleotide sequences obtained for the respective
clones, those having homology to an amino acid sequence deduced from a gene of cellulose 4-β-glucosyltransferase
(bcsA) of the cellulose synthase operon of acetic acid bacterium were selected. Thus, DNA coding for cellulose
synthase was obtained.
Description

Technical Field 

The present invention relates to a DNA coding for cellulose synthase originating from cotton plant (Gossypium
hirsutum), a recombinant DNA containing the same, a transformed cell transformed with the DNA, and a method for
controlling cellular cellulose synthesis.

Background Art 


Cellulose is used for paper, woody structural materials, fiber, cloth, food, cosmetics, and pharmaceuticals, and it
is also utilized as an energy source. Cellulose is therefore industrially useful and valuable. Cellulose is capable
of forming a variety of crystalline structures, and hence new materials are expected to be developed by controlling
the enzymes involved in the biosynthesis of cellulose. The cellulose-related industry has hitherto been directed to
cellulose products that have already been produced, and there has been no attempt to develop a new material from the
standpoint of biosynthesis. The mechanism of disease action exerted by pathogenic microorganisms on plants often
results from the inhibition of cellulose biosynthesis, as in the case of Pyricularia oryzae (P. oryzae). Therefore,
conferring disease resistance on the cellulose biosynthesis mechanism is agriculturally applicable and valuable.
Further, cellulose is the most abundant organic compound on the earth, and it is the sink in which the largest amount
of atmospheric CO2 is fixed. Therefore, the genetic improvement of cellulose biosynthesis enzymes is also applicable
to industries directed to the control of atmospheric CO2 based on the use of cellulose as the sink.

In recent years, cDNAs originating from fiber cells of cotton plant have been randomly sequenced, and it has been
reported that full length CelA1 and partial length CelA2 probably represent cDNAs of cotton plant cellulose synthase,
in view of their homology to the bacterial cellulose synthase gene (bacterial BcsA) (Pear et al., Proceedings of the
National Academy of Sciences, USA (1996) 93, 12637-12642). The ability to bind UDP-glucose has been demonstrated for
CelA1. However, as for CelA2, homology has been demonstrated only for the C-terminal amino acid sequence.

Disclosure of the Invention 

The present invention has been made in order to provide a new method for regulating cellulose production in
prokaryotic cells or eukaryotic cells. An object of the invention is to provide a DNA coding for cellulose synthase,
a recombinant DNA containing the same, a transformed cell transformed with the DNA, and a method for regulating
cellular cellulose synthesis.

The present inventors first extracted mRNAs from cotton plant fiber cells at the stage at which they accumulate
cellulose, and cDNAs complementary thereto were synthesized to construct a cDNA library. A total of 750 cDNA clones
were arbitrarily selected from the library and subjected to sequencing. Six amino acid sequences (one per reading
frame) were derived from the nucleotide sequence of each of the obtained clones in order to select those having
homology to an amino acid sequence obtained by translation of a gene of cellulose 4-β-glucosyltransferase (bcsA) of
the cellulose synthase operon of acetobacterium. As a result, genes classified into three types or groups were
found, and they were designated as PcsA1, PcsA2, and PcsA3 respectively (PcsA is an abbreviation of "Plant Cellulose
Synthase A").

That is, the present invention lies in a DNA coding for any one of the following proteins (A) to (C): 

(A) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID NO:
2 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino acids
relative to SEQ ID NO: 2;

(B) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID NO:
4 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino acids
relative to SEQ ID NO: 4; and

(C) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID NO:
8 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino acids
relative to SEQ ID NO: 8, and comprising an amino acid sequence shown in SEQ ID NO: 11 or an amino acid
sequence involving deletion, substitution, insertion, or addition of one or several amino acids relative to SEQ ID
NO: 11.

In another aspect, the present invention provides a recombinant vector comprising all or a part of the DNA as
defined above, and a transformed cell transformed with the DNA as defined above.

In still another aspect, the present invention provides a method for regulating cellulose synthesis in a cell,
comprising the steps of introducing the DNA as defined above into the cell, and expressing RNA having a nucleotide
sequence homologous to the DNA as defined above or a nucleotide sequence complementary to the DNA as defined
above.

SEQ ID NO: 1 corresponds to a sequence of PcsA1, and SEQ ID NO: 3 corresponds to a sequence of PcsA2. 
SEQ ID NO: 5 corresponds to a sequence of 3'-side region of PcsA3, SEQ ID NO: 7 corresponds to a sequence of 5'- 
side region of PcsA3, and SEQ ID NO: 9 corresponds to a sequence of internal region of PcsA3. 

It has been demonstrated, by expression in eukaryotic cells (animal cells and/or yeast), that PcsA1 and PcsA2 of the
DNAs described above are DNAs coding for cotton plant cellulose synthase. It has also been demonstrated that an
antibody against them inhibits the cotton plant cellulose synthase activity in a cell-free system. Further,
PcsA3, which is different from PcsA1 and PcsA2, has been found. At the stage of the clones obtained by the random
sequencing, each of these species was obtained only as a partial clone, and no 5'-portion of the coding region was
contained. Therefore, clones having sequences of the 5'-portions were isolated in accordance with the 5'-RACE
method, which is based on the use of PCR, and their sequences were determined. As a result of this operation, the
sequences of the 5'-portions corresponding to the partial length clones were obtained for PcsA1 and PcsA2.

On the other hand, as for PcsA3, a sequence of a 5'-portion of another clone, which was considered to belong to
the same PcsA3 group, was obtained. The two sequences had extremely high homology, and hence they were considered
to have arisen relatively recently from an identical gene through duplication into multiple genes. Therefore, even
when the two are combined with each other at corresponding portions to construct a fused gene, followed by
expression, it is assumed that the activity and function of the produced enzyme are not affected thereby.

As for PcsA1 and PcsA2, in order to obtain a full length clone, primers were designed on the basis of the sequence
of the 5'-portion and the sequence of the 3'-portion of the partial length clone to perform PCR. Thus, a clone
containing the ORF was obtained.

Those applicable as the template to be used for the RACE method may be any of cDNA synthesized from mRNA 
and a phage library. When the phage library is used, it is possible to use a sequence in the vector as a 5'-side primer. 

As a result of the random sequencing, 15 clones appeared to code for the cellulose synthase, and of these, clones of
PcsA2 were the most abundant, accounting for seven. Expression was confirmed in eukaryotic cells (animal cells and/or
yeast) transformed with the cellulose synthase gene. As a result, the cellulose synthase activity was observed.

The present invention will be explained in detail below. 

<1> Preparation of cotton plant cDNA library

Cotton plant fiber cells at the stage of cellulose accumulation are preferably used as a material for extracting mRNA
to construct a cotton plant cDNA library. The method for extracting mRNA is not specifically limited, and it is
possible to adopt an ordinary method for extracting mRNA from plants.

cDNA can be synthesized, for example, by using a poly T sequence which is complementary to poly A nucleotide 
existing at the terminal of mRNA as a primer to synthesize complementary DNA by the aid of reverse transcriptase, 
and forming a double strand by the aid of DNA polymerase. 

The method therefor is described, for example, in Molecular Cloning (Maniatis et al., Cold Spring Harbor
Laboratory). Alternatively, a variety of cDNA synthesis kits are commercially available from various companies, and
these may be used.

Generally, the library is constructed by using a phage vector. A variety of commercially available vectors are usable.
However, it is preferable to use a vector such as the λZAP vector, with which it is unnecessary to perform recloning
from the vector and it is possible to immediately prepare a plasmid for sequencing.

<2> Determination of nucleotide sequence of cDNA 

Clones are randomly selected from the obtained cDNA library to determine nucleotide sequences of inserts in the
clones. The nucleotide sequence can be determined in accordance with the Maxam-Gilbert method or the dideoxy
method. Among them, the dideoxy method is more convenient and preferred.

The nucleotide sequence can be determined in accordance with the dideoxy method by using a commercially 
available sequencing kit. Further, the use of an automatic sequencer makes it possible to determine sequences of a 
large number of clones for a short period of time. 

It is unnecessary to determine the sequence for an entire length of the insert. It is enough to determine a length 
of nucleotide sequence which is considered to be sufficient to perform homology search. For example, in Examples 
described later on, the homology search as described below was performed when a sequence having not less than 
60 nucleotides was successfully determined. 
<3> Homology search with gene data base

The determined nucleotide sequence of each of the cDNA clones is used to perform the homology search with respect
to known amino acid sequences of the cellulose synthase or nucleotide sequences of genes coding therefor registered
in the gene data base. The cellulose synthase is exemplified by an enzyme encoded by a gene of cellulose
4-β-glucosyltransferase (BcsA) of the cellulose synthase operon of acetobacterium (Wong, H. C. et al., Proc. Natl.
Acad. Sci. U.S.A., 87, 8130-8134 (1990), ACCESSION No. M37202).

Those usable as the data base include, for example, GenBank, EMBL, and DDBJ, published respectively by Los Alamos
National Laboratory in the United States, the European Molecular Biology Laboratory, and the National Institute of
Genetics (Japan). Programs usable for the homology search include, for example, commercially available DNA analysis
software such as DNASIS (Hitachi Software Engineering Co., Ltd.) and GENETYX (SDC Software Development). The
following methods are also available. That is, a computer terminal is connected with the host computer of the
National Institute of Genetics to perform analysis. Alternatively, a personal computer is connected via the Internet
with NCBI (National Center for Biotechnology Information) to utilize BLAST (Basic Local Alignment Search Tool)
(http://www.ncbi.nlm.nih.gov/BLAST/) so that a high speed homology search is performed.

The homology search is performed, for example, in accordance with the following algorithm. When the homology
search is performed for a nucleotide sequence, the comparison proceeds while the nucleotide sequence to be
investigated is shifted one nucleotide at a time with respect to the individual gene sequences included in the data
base. When six or more continuous nucleotides are coincident, the homology score is counted and calculated in
accordance with a homology score table (see, for example, M. Dayhoff, Atlas of Protein Sequence and Structure,
vol. 5 (1978)). The system is set so that those having a score not less than a certain value are picked up as
candidates which have homology. Further, gaps may be introduced into the sequence to be investigated or into the gene
sequence included in the data base to perform optimization so that the score is maximized.
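
The seeding step of the algorithm described above can be sketched as follows. This is only an illustrative sketch in Python, not the program used by the inventors: the function name `find_seeds` and the example sequences are hypothetical, and the score table and gap optimization mentioned in the text are deliberately left out.

```python
def find_seeds(query: str, subject: str, min_run: int = 6):
    """Return (query_start, subject_start, length) for every maximal run of
    at least `min_run` identical contiguous nucleotides.

    Each diagonal offset corresponds to shifting the query by one
    nucleotide relative to the data base sequence.
    """
    query, subject = query.upper(), subject.upper()
    seeds = []
    for offset in range(-(len(query) - 1), len(subject)):
        # Range of query positions that overlap the subject at this offset.
        q_lo = max(0, -offset)
        q_hi = min(len(query), len(subject) - offset)
        run = 0
        for q in range(q_lo, q_hi + 1):
            is_match = q < q_hi and query[q] == subject[q + offset]
            if is_match:
                run += 1
            else:
                if run >= min_run:
                    seeds.append((q - run, q - run + offset, run))
                run = 0
    return seeds


# Hypothetical sequences sharing the six-nucleotide run "CGTGCA".
seeds = find_seeds("AAACGTGCATT", "CCCGTGCACC")
print(seeds)
```

A candidate found in this way would then be scored and, if necessary, extended with gaps; those later steps depend on the score table chosen and are not shown here.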

When the homology search is performed for an amino acid sequence, the nucleotide sequence to be investigated
is translated into amino acids in all six reading frames, including those of the complementary strand. The
investigation may then be performed in the same manner as for the nucleotide sequence. Specifically, it is possible
to use blastx of BLAST described above. As for detailed techniques and conditions for the search, reference may be
made to DDBJ News Letter, No. 15 (February 1995).
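
As an illustration of the six-frame conversion, the following minimal Python sketch translates a nucleotide sequence in the three forward frames and the three frames of the complementary strand. It assumes the standard genetic code; the function names and the example sequence are hypothetical and do not come from the specification.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq: str) -> dict:
    """Return the six translations keyed '+1'..'+3' and '-1'..'-3'."""
    frames = {}
    for label, strand in (("+", seq), ("-", reverse_complement(seq))):
        for shift in range(3):
            frames[f"{label}{shift + 1}"] = translate(strand[shift:])
    return frames


frames = six_frame_translation("ATGGCCTGA")
print(frames)
```

Each of the six translated sequences would then be compared with the amino acid sequence of the reference enzyme, which is in effect what blastx does internally.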

<4> Isolation of cDNA clone of cotton plant cellulose synthase

The clone obtained as described above does not necessarily contain the entire nucleotide sequence of the gene. In
such a case, the clone is used as a probe to perform screening by means of plaque hybridization. Thus, it is possible
to obtain a clone containing a full length gene from the library. The specific method may be carried out with
reference to Molecular Cloning, second edition (Maniatis et al., Cold Spring Harbor Laboratory) 12.30 to 12.40.

When the obtained cDNA is deficient in the 5'-portion, the 5'-portion can also be obtained by synthesizing primers so
that the cDNA sequence may be elongated toward the 5'-terminal, and performing RT-PCR by using mRNA as a
template.

As demonstrated in Examples described later on, the DNA of the present invention has been obtained as DNA
having homology to the known bacterial cellulose synthase gene. The DNA further codes for an amino acid sequence
GlnXXXXXXArgTrp (SEQ ID NO: 12), which is considered to form a UDP-glucose binding domain, and it has high homology
in the vicinity thereof.

The nucleotide sequences of DNA of the present invention obtained as described above and the amino acid
sequences deduced from the nucleotide sequences are shown in SEQ ID NOs: 1 to 10 in Sequence Listing. SEQ ID
NOs: 1 and 3 show nucleotide sequences of PcsA1 and PcsA2 respectively. SEQ ID NOs: 2 and 4 show amino acid
sequences deduced from the nucleotide sequences of PcsA1 and PcsA2 respectively.

SEQ ID NOs: 5 and 6 show a nucleotide sequence of a clone (PcsA3-682) containing the 3'-side region of PcsA3 and
an amino acid sequence deduced from the nucleotide sequence respectively. SEQ ID NOs: 7 and 8 show a nucleotide
sequence of a 5'-portion (PcsA3-5') of another clone containing the 5'-side region of PcsA3 and an amino acid
sequence deduced from the nucleotide sequence respectively. SEQ ID NOs: 9 and 10 show a nucleotide sequence of a
3'-portion (PcsA3-3') of the clone and an amino acid sequence deduced from the nucleotide sequence respectively (see
Fig. 1). That is, SEQ ID NO: 5 corresponds to the 3'-side region of PcsA3, SEQ ID NO: 7 corresponds to the 5'-side
region of PcsA3, and SEQ ID NO: 9 corresponds to an internal region of PcsA3. The overlapping portion of PcsA3-682
differs from that of PcsA3-3' by 9 nucleotides in the nucleotide sequence and by 1 amino acid in the amino acid
sequence. Figs. 3 and 4 show the comparison between the nucleotide sequences of PcsA3-682 and PcsA3-3'. SEQ ID NO: 11
shows a combination of the amino acid sequences encoded by PcsA3-682 and PcsA3-3'.

The sequence of GlnXXXXXXArgTrp (SEQ ID NO: 12) corresponds to amino acid numbers 710 to 714 in SEQ ID
NO: 2 for PcsA1, amino acid numbers 778 to 782 in SEQ ID NO: 4 for PcsA2, and amino acid numbers 356 to 360 in
SEQ ID NO: 6 for PcsA3.

PcsA1 differs from CelA1 reported by Pear et al. (Proceedings of the National Academy of Sciences, USA (1996),
93, 12637-12642) by 28 nucleotides in the nucleotide sequence. As a result, the two differ by 10 amino acid residues
in the amino acid sequences encoded thereby. In general, the sugar chain specificity and the substrate
specificity of a sugar chain transferase are greatly changed by a point mutation of a nucleotide of the DNA (Yamamoto
and Hakomori, The Journal of Biological Chemistry (1990) 265, 19257-19262). Therefore, it is unclear whether or not
CelA1 codes for a protein having the cellulose synthase activity. Incidentally, the 48th Arg, the 56th Ser, the 81st
Asn, the 104th Ala, the 110th Ser, the 247th Asp, the 376th Asp, the 386th Ser, the 409th Arg, and the 649th Ser in
the amino acid sequence encoded by CelA1 correspond to Gln, Ile, Ser, Thr, Pro, Asn, Glu, Pro, His, and Gly in PcsA1
respectively.
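
For reference, the ten residue differences listed above can be written in the usual compact substitution notation. The short Python sketch below only restates the positions and residues given in the text (with the residue names read as Gln and Ile); the variable and function names are hypothetical.

```python
# (position in CelA1, residue in CelA1, residue in PcsA1), as listed in the text.
CELA1_TO_PCSA1 = [
    (48, "Arg", "Gln"), (56, "Ser", "Ile"), (81, "Asn", "Ser"),
    (104, "Ala", "Thr"), (110, "Ser", "Pro"), (247, "Asp", "Asn"),
    (376, "Asp", "Glu"), (386, "Ser", "Pro"), (409, "Arg", "His"),
    (649, "Ser", "Gly"),
]

# Three-letter to one-letter codes for the residues that occur above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Gln": "Q", "Glu": "E",
    "Gly": "G", "His": "H", "Ile": "I", "Pro": "P", "Ser": "S", "Thr": "T",
}


def substitution_notation(position: int, cela1: str, pcsa1: str) -> str:
    """Format one difference as e.g. 'R48Q' (CelA1 residue, position, PcsA1 residue)."""
    return f"{THREE_TO_ONE[cela1]}{position}{THREE_TO_ONE[pcsa1]}"


notations = [substitution_notation(*entry) for entry in CELA1_TO_PCSA1]
print(len(notations), ", ".join(notations))
```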

PcsA2 of the present invention contains the same sequence as that of CelA2 reported by Pear et al. However,
CelA2 has an incomplete length, and it does not contain the entire coding region. CelA2 corresponds to nucleotide
numbers 1083 to 3311 in the nucleotide sequence of PcsA2 shown in SEQ ID NO: 3.

Each of the amino acid sequences shown in SEQ ID NOs: 2, 4, 6, 8, 10, and 11 is a novel sequence. All genes
having nucleotide sequences coding for these amino acid sequences are included in the present invention.

The amino acid sequences described above may include deletion, substitution, insertion, and/or addition of one
or more amino acid residues provided that the characteristic of the gene of the present invention is not substantially
affected. The deletion, substitution, insertion, and/or addition of one or more amino acid residues as described above
is obtainable by modifying the DNAs coding for the amino acid sequences shown in SEQ ID NOs: 2, 4, 6, 8, 10, and
11 randomly in accordance with an ordinary mutation treatment or intentionally in accordance with the site-directed
mutagenesis method. As described above, in general, the sugar chain specificity and the substrate specificity of a
sugar chain transferase are greatly changed by a point mutation of a nucleotide of the DNA. Therefore, DNA coding
for a protein having the cellulose synthase activity is selected from the modified DNAs. The cellulose synthase
activity can be measured, for example, by means of the method described by T. Hayashi: Measuring β-glucan deposition
in plant cell walls, in Modern Methods of Plant Analysis: Plant Fibers, eds. H. F. Linskens and J. F. Jackson,
Springer-Verlag, 10: 138-160 (1989).

Plants harboring proteins or genes partially different from the sequences shown in Sequence Listing may exist
depending on, for example, the variety of cotton plant or natural mutation. However, such genes are also included in
the gene of the present invention. Such a gene may be obtained as DNA which is hybridizable under the stringent
condition with all or a part of the coding region of the nucleotide sequence shown in SEQ ID NO: 1, 3, 5, 7, or 9. The
"stringent condition" referred to herein indicates a condition under which a so-called specific hybrid is formed, and
a non-specific hybrid is not formed. It is difficult to express such a condition definitely by using a numerical
value. However, the stringent condition is exemplified by a condition under which nucleic acids having high homology,
for example, DNAs having homology of not less than 80 %, undergo hybridization with each other, and nucleic acids
having homology lower than the above do not undergo hybridization with each other.
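
The 80 % figure given above as an example can be checked computationally for a pair of sequences that have already been aligned. The Python sketch below is an assumption-laden illustration: it treats homology as simple percent identity over aligned, non-gap positions, and the function names and example sequences are hypothetical. It is only a proxy, since actual hybridization behavior also depends on experimental conditions that the text does not specify numerically.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over positions where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def meets_homology_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when identity is not less than the threshold (80 % by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments: 8 of 10 positions identical.
print(percent_identity("AAAAAAAAAA", "AAAAAAAATT"))
print(meets_homology_threshold("AAAAAAAAAA", "AAAAAAATTT"))
```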

<5> Utilization of gene of the present invention 

The DNA of the present invention makes it possible to control the cellulose synthesis in prokaryotic cells such as
acetobacterium and/or eukaryotic cells such as yeasts belonging to, for example, the genus Saccharomyces, cells of
plants such as cotton plant, and cultured cells of mammals and the like.

Specifically, the cellulose synthesis in the cells as described above can be facilitated, for example, by connecting
a promoter to an upstream region of the DNA of the present invention, inserting an obtained fragment into an appropriate
vector to construct a recombinant vector, and introducing the vector into the cells. Alternatively, the cellulose synthesis
in the cells can be suppressed by introducing an antisense gene of the DNA of the present invention into the cells.

The promoter and the vector may be selected from those ordinarily utilized to express heterologous genes, and
the method ordinarily employed to express heterologous genes may be used as the transformation method. Specifically,
in the case of yeast, it is possible to use a protein expression kit produced by Invitrogen, i.e., Pichia Expression
Kit, and a vector pPIC9 contained in this kit. For example, COS7 cells may be used as mammalian cultured cells, and
a vector CDM8 may be used therefor.

The present invention provides the DNA coding for cellulose synthase. Incorporating the DNA into prokaryotic cells
or eukaryotic cells provides a new method for controlling cellulose production.

Brief Description of the Drawings 


Fig. 1 shows a relationship between two clones of PcsA3 as an embodiment of the DNA of the present invention. 
Regions interposed between arrows indicate regions for which nucleotide sequences have been determined. A dotted 
line indicates a region for which no nucleotide sequence has been determined. 
Fig. 2 shows a structure of EcoRI adapter.

Fig. 3 shows comparison between sequences of PcsA3-682 and PcsA3-3' (former half). 

Fig. 4 shows comparison between sequences of PcsA3-682 and PcsA3-3' (latter half). * indicates coincident
nucleotides, and indicates non-coincident nucleotides.


Best Mode for Carrying Out the Invention 

Examples of the present invention will be explained below. 

<1> Preparation of total RNA from cotton plant

Cotton plant (Gossypium hirsutum L.) Coker 312 was used as a material. Fiber cells at 16 to 18 days post anthesis
were collected in liquid nitrogen. The cotton plant fiber cells in an amount of 75 g were thoroughly ground in a mortar
while being frozen with liquid nitrogen. The powdered fiber was transferred to a centrifuge tube equipped with a cap,
to which 375 mg of DTT as a powder was added, followed by addition of 200 ml of XT buffer which had been heated to 90
to 95 °C. (XT buffer was obtained by adjusting 0.2 M sodium borate containing 30 mM EDTA and 1 % SDS to pH 9.0,
applying a diethylpyrocarbonate treatment, autoclaving, and then adding vanadyl ribonucleoside to the resulting
solution to give a concentration of 10 mM.) The obtained solution was thoroughly agitated.

To the solution, 100 mg of proteinase K was added, and it was agitated again. The solution was incubated at 40
°C for 2 hours, and then 16 ml of 2 M KCl was added. The solution was thoroughly agitated again, and it was
left to stand on ice for 1 hour, followed by centrifugation for 20 minutes (4 °C) at 12,000 g by using a high
speed refrigerated centrifuge.

The obtained supernatant was filtered to remove floating matter. The solution was transferred to a measuring
cylinder to measure the volume. The solution was then transferred to another centrifuge tube, to which lithium chloride
was added in an amount of 85 mg per 1 ml of the extract solution to give a final concentration of 2 M. The solution was
left to stand at 4 °C overnight, and then precipitated RNA was separated by centrifugation for 20 minutes
at 12,000 g. The obtained precipitate of RNA was washed and precipitated twice with cooled 2 M lithium chloride.
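
As a quick arithmetic check, 85 mg of lithium chloride per ml does correspond to about 2 M. The Python sketch below assumes a molar mass of 42.39 g/mol for anhydrous LiCl and ignores the small volume change on dissolution; the function name is hypothetical.

```python
LICL_MOLAR_MASS_G_PER_MOL = 42.39  # assumed: Li 6.94 + Cl 35.45


def molarity_from_mg_per_ml(mg_per_ml: float, molar_mass: float) -> float:
    """Convert a mass concentration to mol/l (mg/ml is numerically equal to g/l)."""
    return mg_per_ml / molar_mass


licl_molarity = molarity_from_mg_per_ml(85.0, LICL_MOLAR_MASS_G_PER_MOL)
print(f"{licl_molarity:.2f} M")
```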

The obtained RNA was dissolved in 10 mM Tris buffer (pH 7.5) to give a concentration of about 2 mg/ml, to which 
5 M potassium acetate was added to give a concentration of 200 mM. Ethanol was added thereto to give a concentration 
of 70 %, followed by cooling at -80 °C for 10 minutes. Centrifugation was performed at 4 °C for 10 minutes at 15,000 
rpm, and then an obtained precipitate was suspended in an appropriate amount of sterilized water to give an RNA 
sample. As a result of quantitative measurement for the RNA sample, total RNA was obtained in an amount of 2 mg. 
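
The volumes of 5 M potassium acetate and of ethanol needed to reach the stated final concentrations follow from a simple mixing relation. The sketch below is illustrative only: it assumes ideal mixing and 100 % ethanol as the stock, and the 1.0 ml starting volume is a hypothetical value, not one stated in the text.

```python
def volume_to_add(initial_volume: float, stock_conc: float, target_conc: float) -> float:
    """Volume of stock to add so that the mixture reaches target_conc.

    Solves stock_conc * v = target_conc * (initial_volume + v), assuming the
    initial solution contains none of the added component.
    """
    if not 0 < target_conc < stock_conc:
        raise ValueError("target must be positive and below the stock concentration")
    return initial_volume * target_conc / (stock_conc - target_conc)


rna_volume_ml = 1.0  # hypothetical starting volume of the RNA solution
koac_ml = volume_to_add(rna_volume_ml, stock_conc=5.0, target_conc=0.2)          # M
ethanol_ml = volume_to_add(rna_volume_ml + koac_ml, stock_conc=100.0, target_conc=70.0)  # %
print(f"5 M potassium acetate: {koac_ml:.3f} ml, ethanol: {ethanol_ml:.2f} ml")
```

Under these assumptions roughly 1/24 volume of potassium acetate and about 2.3 volumes of ethanol are required.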

<2> Purification of mRNA 


mRNA was purified as a poly(A)+ RNA fraction from the total RNA obtained as described above. Purification was
performed by using Oligotex-dT30 <Super> (purchased from Toyobo) as oligo(dT)-immobilized latex for poly(A)+ RNA
purification.

Elution buffer (10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.1 % SDS) was added to a solution containing 1 mg of the
total RNA to give a total volume of 1 ml, to which 1 ml of Oligotex-dT30 <Super> was added, followed by heating at
65 °C for 5 minutes and quick cooling on ice for 3 minutes. To the obtained solution, 0.2 ml of 5 M NaCl was added,
and it was incubated at 37 °C for 10 minutes, followed by centrifugation at 15,000 rpm for 3 minutes. After that, the
supernatant was carefully removed.
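
The salt concentration reached in this binding step can be estimated from the stated volumes. The Python sketch below assumes that the 2 ml mixture contains no NaCl before the addition, which may not hold for the commercial latex suspension; the function name is hypothetical.

```python
def final_concentration(stock_molar: float, stock_volume_ml: float,
                        initial_volume_ml: float) -> float:
    """Molarity after adding a stock to a solution assumed to be free of the solute."""
    return stock_molar * stock_volume_ml / (initial_volume_ml + stock_volume_ml)


# 1 ml RNA solution + 1 ml Oligotex suspension, then 0.2 ml of 5 M NaCl.
binding_nacl_molar = final_concentration(5.0, 0.2, 2.0)
print(f"{binding_nacl_molar:.2f} M NaCl")
```

The result, about 0.45 M, is close to the 0.5 M NaCl of the Washing Buffer used in the following step.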

The obtained pellet was suspended in 2.5 ml of Washing Buffer (10 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.5 M NaCl,
0.1 % SDS), and the suspension was centrifuged at 15,000 rpm for 3 minutes. After that, the supernatant was carefully
removed. The obtained pellet was suspended in 1 ml of TE Buffer, and then it was heated at 65 °C for 5 minutes. The
suspension was quickly cooled on ice for 3 minutes, and then it was centrifuged at 15,000 rpm for 3 minutes to recover
poly(A)+ mRNA contained in the obtained supernatant.

Thus, the poly(A)+ mRNA in an amount of about 10 µg was obtained from 1 mg of the total RNA. An aliquot of 5
µg thereof was used to prepare a cDNA library.

<3> Preparation of cDNA library 

(1) Synthesis of cDNA 


The mRNA obtained as described above was used as a template to synthesize cDNA by using a λZAP cDNA
synthesis kit produced by Stratagene. The following solution was prepared and mixed in a tube.

5.0 µl 10 x 1st Strand Buffer (buffer for reverse transcription reaction);
3.0 µl 10 mM 1st Strand Methyl Nucleotide Mix (5-methyl dCTP, dATP, dGTP, dTTP mixture);
2.0 µl Linker-Primer (linker and primer);
H2O (adjusted to give a total volume of 50 µl);
1.0 µl RNase Block II (RNase inhibitor).

The respective components described above were contents of the kit. Linker-Primer had a sequence as shown in
SEQ ID NO: 13. Methylated nucleotide was used so that the cDNA would not be digested in the restriction enzyme
reaction performed later on. The reaction solution was agitated well, and then 5.0 µg of poly(A)+
mRNA was added thereto, followed by being left to stand at room temperature for 10 minutes. Further, 2.5 µl of
M-MuLV RTase (reverse transcriptase) was added (at this time, the total volume was 50 µl). The reaction solution was
gently mixed, followed by brief, gentle centrifugation to bring the reaction solution to the bottom of the
tube. The reaction was performed at 37 °C for 60 minutes.

Next, the following solution was prepared and mixed in the tube in a certain order. 


45.0 µl reaction solution containing cDNA first strand;
40.0 µl 10 x 2nd Strand Buffer (buffer for polymerase reaction);
6.0 µl 2nd Strand Nucleotide Mixture (A, G, C, T mixture);
302.0 µl H2O.


The following solution was further added. However, in order to allow RNase and DNA polymerase to simultaneously 
act, enzyme solutions were allowed to adhere to the wall of the tube. After that, a vortex treatment was promptly 
performed, and the reaction solutions were allowed to fall to the bottom of the tube by means of centrifugation to 
perform a reaction for synthesizing cDNA second strand at 16 °C for 150 minutes. 


0.8 µl RNase H (RNA-degrading enzyme);
7.5 µl DNA polymerase I (10.0 u/µl).
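
The component volumes listed for the second-strand reaction can be tallied to confirm that the 10 x buffer ends up at roughly 1 x. The short Python sketch below uses only the volumes given in the two lists above; the variable names are hypothetical.

```python
# Volumes in microlitres, as listed for the second-strand reaction.
components_ul = {
    "first-strand reaction solution": 45.0,
    "10 x 2nd Strand Buffer": 40.0,
    "2nd Strand Nucleotide Mixture": 6.0,
    "H2O": 302.0,
    "RNase H": 0.8,
    "DNA polymerase I": 7.5,
}

total_volume_ul = sum(components_ul.values())
# Final strength of the buffer supplied as a 10 x stock.
buffer_fold = 10.0 * components_ul["10 x 2nd Strand Buffer"] / total_volume_ul
# Units of DNA polymerase I at the stated 10.0 u/µl.
polymerase_units = components_ul["DNA polymerase I"] * 10.0

print(f"total {total_volume_ul:.1f} µl, buffer {buffer_fold:.2f} x, "
      f"DNA polymerase I {polymerase_units:.0f} u")
```

The total of about 400 µl is consistent with the 400 µl of phenol:chloroform used for the subsequent extraction.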

To the reaction solution, 400 µl of a mixed solution of phenol:chloroform (1:1) was added. Agitation was performed
well, followed by centrifugation at room temperature for 2 minutes. To the obtained supernatant, 400 µl of
phenol:chloroform was added again, and the mixture was subjected to a vortex treatment and centrifugation at room
temperature for 2 minutes. The following solution was added to the obtained supernatant to precipitate cDNA.

33.3 µl 3 M sodium acetate solution;
867.0 µl 100 % ethanol.

The obtained solution was left to stand at -20 °C overnight, and it was centrifuged at room temperature for 60
minutes. After that, washing was gently performed with 80 % ethanol, followed by centrifugation for 2 minutes. The
supernatant was removed. The obtained pellet was dried, and it was dissolved in 43.5 µl of sterilized water. The
following solution was added to an aliquot (39.0 µl) to blunt-end the cDNA terminals.

5.0 µl 10 x T4 DNA Polymerase Buffer (buffer for T4 polymerase reaction);
2.5 µl 2.5 mM dNTP Mix (A, G, C, T mixture);
3.5 µl T4 DNA polymerase (2.9 u/µl).


The reaction was performed at 37 °C for 30 minutes, after which 50 µl of distilled water was added, and then 100 µl
of phenol:chloroform was added thereto, followed by a vortex treatment and centrifugation for 2 minutes. To the
obtained supernatant, 100 µl of chloroform was added, and the mixture was subjected to a vortex treatment, followed
by centrifugation for 2 minutes. The following solution was added to the supernatant to precipitate cDNA.


7.0 µl 3 M sodium acetate solution;
226 µl 100 % ethanol.

The solution was left to stand on ice for 30 minutes or more, and it was centrifuged at 4 °C for 60 minutes. The
obtained precipitate was washed with 150 µl of 80 % ethanol, followed by centrifugation for 2 minutes and drying. The
cDNA pellet was dissolved in 7.0 µl of EcoRI adapter solution, to which the following solution was added to ligate the
EcoRI adapter to both ends of the cDNA. Sequences of respective strands of the EcoRI adapter are shown in SEQ ID
NO: 14 and Fig. 2.

1.0 µl 10 x Ligation Buffer (buffer for ligase reaction);
1.0 µl 10 mM ATP;
1.0 µl T4 DNA ligase.

The reaction solution was briefly and gently centrifuged, and it was left to stand at 4 °C overnight or longer.
The solution was treated at 70 °C for 30 minutes, and then it was briefly and gently centrifuged, followed by being
left to stand at room temperature for 5 minutes. The following solution was added to the reaction solution to
phosphorylate the 5'-terminals of the EcoRI adapter.

1.0 µl 10 x Ligation Buffer (buffer for ligase reaction);
2.0 µl 10 mM ATP;
6.0 µl H2O;
1.0 µl T4 polynucleotide kinase (10.0 u/µl).

The reaction was performed at 37 °C for 30 minutes, followed by a treatment at 70 °C for 30 minutes. The solution
was briefly and gently centrifuged, and it was left to stand at room temperature for 5 minutes. The following solution
was further added thereto to perform a reaction at 37 °C for 90 minutes so that the XhoI site introduced by
Linker-Primer was digested with XhoI, followed by being left to stand at room temperature for cooling.

28.0 µl XhoI Buffer;
3.0 µl XhoI (45 u/µl).

5.0 µl of 10 x STE (10 mM Tris-HCl (pH 8.0), 100 mM NaCl, 1 mM EDTA) was added to the reaction solution,
which was loaded onto a centrifuge column for removing short fragments (Sephacryl Spin Column) and centrifuged
at 600 g for 2 minutes to obtain an eluate which was designated as Fraction 1. This operation was further
repeated three times to obtain Fractions 2, 3, and 4 respectively. Fractions 3 and 4 were combined, to which
phenol:chloroform (1:1) was added and agitated well, followed by centrifugation at room temperature for 2 minutes. An
equal amount of chloroform was added to the obtained supernatant, and the obtained mixture was agitated well. The mixture
was centrifuged at room temperature for 2 minutes to obtain a supernatant to which a two-fold amount of 100% ethanol
was added, followed by being left to stand at -20 °C overnight. The solution was centrifuged at 4 °C for 60 minutes,
followed by washing with an equal amount of 80 % ethanol. The solution was centrifuged at 4 °C for 60 minutes to
obtain a cDNA pellet which was suspended in 10 µl of sterilized water.

(2) Preparation of cDNA library 

The double-stranded cDNA obtained as described above was ligated with a λ phage expression vector to prepare a
recombinant vector. The following solution was prepared and mixed in a tube to perform a reaction at 12 °C overnight,
followed by being left to stand at room temperature for 2 hours to ligate the cDNA with the vector.

2.5 µl cDNA solution;
0.5 µl 10 x Ligation Buffer;
0.5 µl 10 mM ATP;
1.0 µl λZAP vector DNA (1 µg/µl);
0.5 µl T4 DNA ligase (4 Weiss u/µl).

(3) Packaging of phage DNA into phage particles 

The phage vector containing the cDNA was packaged into phage particles by using an in vitro packaging kit
(Gigapack II Gold packaging extract: produced by Stratagene). The recombinant phage solution was added to the
Freeze/Thaw extract immediately after thawing, and the solution was placed on ice, to which 15 µl of Sonic extract was
added and mixed well by pipetting. The reaction solution was centrifuged under a mild condition, and it was left
to stand at room temperature (22 °C) for 2 hours. 500 µl of Phage Dilution Buffer was added to the reaction solution,
to which 20 µl of chloroform was further added, followed by mixing. In order to measure the titer of the library, an aliquot
(2 µl) of the 500 µl of the aqueous phase was diluted in a ratio of 1:10 with 18 µl of SM buffer (5.8 g of NaCl, 2 g of
MgSO4·7H2O, 50 ml of 1 M Tris-HCl (pH 7.5), and 5 ml of 2 % gelatin in 1 L). The diluted solution (1 µl) and the phage
stock solution (1 µl) were each plated together with 200 µl of a culture solution of Escherichia coli PLK-F' strain
having been cultivated to an OD600 of 0.5. That is, Escherichia coli PLK-F' strain was mixed with the
phage solution to perform cultivation at 37 °C for 15 minutes. The obtained culture was added to 2 to 3 ml of top agar
(48 °C), which was immediately overlaid on an NZY agar plate having been warmed at 37 °C. Cultivation was performed
overnight at 37 °C, and the plaques that appeared were counted to calculate the titer. As a result, the titer was 1.2 x 10 s pfu/ml.
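
The titer calculation mentioned above follows the usual plaque-assay arithmetic: the number of plaques is multiplied by the dilution factor and divided by the volume plated. A minimal Python sketch is given below. The function name and the plaque count used in the example are hypothetical and are not taken from the present description.

```python
def titer_pfu_per_ml(plaques: int, dilution_factor: float, volume_plated_ul: float) -> float:
    """Return the titer in pfu/ml from a plaque count.

    plaques          -- number of plaques counted on the plate
    dilution_factor  -- 10 for a 1:10 dilution, 1 for undiluted stock
    volume_plated_ul -- volume of phage solution plated, in microliters
    """
    if plaques < 0 or dilution_factor <= 0 or volume_plated_ul <= 0:
        raise ValueError("counts and volumes must be positive")
    # 1000 converts the plated volume from microliters to milliliters.
    return plaques * dilution_factor * 1000.0 / volume_plated_ul


# Hypothetical example: 23 plaques from 1 µl of a 1:10 dilution.
print(f"{titer_pfu_per_ml(23, 10, 1.0):.1e} pfu/ml")
```

Plating both the diluted solution and the undiluted stock, as described above, gives two independent estimates that should agree within counting error.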

(4) Amplification of library 

The packaging solution containing about 50,000 recombinant bacteriophages and 600 µl of a culture solution of
Escherichia coli PLK-F' strain having been cultivated to an OD600 of 0.5 were placed in a centrifuge tube,
followed by cultivation at 37 °C for 15 minutes. 6.5 ml of top agar having been melted and
maintained at 48 °C was added to the culture solution, which was overlaid on a 150 mm NZY plate having been warmed at about 37 °C,
followed by cultivation at 37 °C for 5 to 8 hours. 10 ml of SM Buffer was added to each plate, followed by
incubation at 4 °C overnight with gentle shaking. The SM Buffer in the respective plates was collected in a sterilized
polypropylene tube. The respective plates were rinsed with 2 ml of SM Buffer, and the rinsing solutions were collected in
the same tube. Chloroform in an amount corresponding to 5 % of the total amount was added and mixed, followed by
being left to stand at room temperature for 15 minutes. Bacterial cells were removed by centrifugation at 4,000 g for
5 minutes. Chloroform in an amount corresponding to 0.3 % of the total amount was added to the obtained supernatant,
and it was stored at 4 °C. The titer of the library amplified as described above was measured in the same
manner as described above. As a result, the titer was 2.3 x 10^9 pfu/ml.

(5) Excision of plasmid from phage DNA 

In vivo excision of the plasmid portion from the recombinant phage DNA was performed. The following solution
was mixed in a 50 ml conical tube to cause infection at 37 °C for 15 minutes:

culture solution of Escherichia coli XL1-Blue (OD600 = 0.1) 200 µl;
phage solution after amplification 200 µl (> 1 x 10^5 phage particles);
helper phage R408 1 µl (> 1 x 10^6 pfu/ml).

5 ml of 2 x YT medium was added to the mixed solution to perform cultivation at 37 °C for 3 hours with shaking.
A heat treatment was applied thereto at 70 °C for 20 minutes, followed by centrifugation at 4,000 g for 5 minutes. The
obtained supernatant was decanted and transferred to a sterilized tube. Centrifugation was performed to obtain a
supernatant which was diluted 100 times to obtain a solution. An aliquot (20 µl) of the solution was mixed with 200 µl
of a culture solution of Escherichia coli XL1-Blue having been cultivated to an OD600 of 1.0 to cause
infection at 37 °C for 15 minutes. Aliquots (1 to 100 µl) of the culture solution were plated on LB plates containing
ampicillin, followed by cultivation at 37 °C overnight. Colonies that appeared were randomly selected. Glycerol was
added to the selected colonies, and they were stored at -80 °C.

(6) Preparation of plasmid 

Plasmids were prepared by using Magic Mini-prep kit produced by Promega. The culture fluid of Escherichia coli
harboring the plasmid, having been stored at -80 °C, was inoculated into 5 ml of 2 x YT medium, followed by cultivation
at 37 °C overnight. Centrifugation was performed for 5 minutes (4,000 rpm, 4 °C), and the supernatant was removed by
decantation. 1 ml of TE buffer was added to the obtained bacterial cell pellet, followed by a vortex treatment. The
obtained bacterial cell suspension was transferred to an Eppendorf tube, followed by centrifugation for 5 minutes (5,000
rpm, 4 °C). The resultant supernatant was removed by decantation.

300 µl of Cell Resuspension Solution was added to the obtained bacterial cell pellet, which was sufficiently
suspended therein. The obtained suspension was transferred to an Eppendorf tube. The suspension was agitated for
2 minutes with a mixer, to which 300 µl of Cell Lysis Solution was added, followed by agitation until the suspension
became transparent. Neutralization Solution (300 µl) was added thereto, and agitation was performed by shaking by
hand, followed by centrifugation for 10 minutes (15,000 rpm).

Only the obtained supernatant was transferred to a new Eppendorf tube (1.5 ml). A suction tube was prepared, to
which a cock, a miniature column, and a syringe (injector) were connected in this order. A resin in an amount of 1 ml
was charged into the syringe. The supernatant was poured into the syringe and mixed well, followed
by suction. Column Washing Solution in an amount of 2 ml was added, and washing was performed under
suction. Suction was continued for 1 to 2 minutes in order to dry the column. The miniature column was removed from the
equipment, and it was set in a new Eppendorf tube (1.5 ml). Sterilized water in an amount of 100 µl, having been warmed
to 65 to 70 °C, was poured into the miniature column, and the column and the Eppendorf tube were centrifuged together
for 1 minute (5,000 rpm). The eluted solution was transferred to an Eppendorf tube, to which 5 µl of 3 M sodium acetate
aqueous solution was added, and 250 µl of cold ethanol was added thereto. The solution was centrifuged (15,000 rpm,
25 minutes), and the supernatant was discarded. 1 ml of 70 % ethanol was added to the obtained precipitate, followed
by centrifugation again (15,000 rpm, 3 minutes). Ethanol was completely removed, and the tube was vacuum-dried in
a desiccator. The precipitate was sufficiently dissolved in 20 µl of sterilized water, and the obtained solution was stored
at -20 °C. An aliquot (1 µl) of the solution was dispensed, and it was subjected to electrophoresis together with volume
markers to quantitatively determine the plasmid DNA.

<4> Determination of nucleotide sequence of cDNA and homology search with gene data base 

(1) Determination of nucleotide sequence of cDNA 

The nucleotide sequence of cDNA was analyzed by using DNA automatic sequencer 373A produced by Applied 
Biosystems Inc. (ABI). The sequencing reaction was performed in accordance with an attached manual by using T3 
primer based on the use of Dye Primer Cycle Sequencing Kit produced by the same company. The nucleotide sequence 
was determined for about 750 clones which were randomly selected. 

(2) Homology search 

Partial sequences of the approximately 750 clones were searched by computer using BlastX. As a result, three clones
appeared to be homologues of a bacterial cellulose synthase subunit. Isolation of full-length clones was therefore attempted.
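
BlastX compares the conceptual translation of a nucleotide query in all six reading frames with protein sequences. The Python sketch below illustrates only that translation step, assuming the standard genetic code. It is not the BlastX program, and the function names are hypothetical. The example input is the first six codons of the open reading frame given in SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate from the first base; unrecognized codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Return translations of the three forward and three reverse frames."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


orf_start = "ATGATGGAATCTGGGGTT"  # first six codons of the ORF in SEQ ID NO: 1
print(translate(orf_start))  # MMESGV
```

Translating every frame is what allows a partial cDNA read of unknown frame and orientation, such as a single T3-primed read, to be matched against a protein sequence.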

<5> Isolation of full length clones 

(1) 5'-RACE

The homology search showed that the obtained homologue clones were partial-length clones. Therefore,
primers were synthesized for extension toward the 5' upstream region, and RT-PCR was performed by using
mRNA as a template.






(1-a) Synthesis of first-strand DNA 

The following solution was prepared and mixed in a tube. 



0.5 µl 10 U-mol gene-specific primer 1;
1 µg total RNA;
DEPC-treated H2O (adjusted to give a total amount of 9 µl).

The following oligonucleotides were used as the gene-specific primer 1. That is, an oligonucleotide having a
nucleotide sequence shown in SEQ ID NO: 15 was used for PcsA1. An oligonucleotide having a nucleotide sequence
shown in SEQ ID NO: 16 was used for PcsA2. An oligonucleotide having a nucleotide sequence shown in SEQ ID NO:
17 was used for PcsA3.

The reaction solution was gently mixed, and then it was centrifuged under a mild condition to allow the reaction 
solution to fall to the bottom of the tube. The solution was left to stand at 70 °C for 10 minutes, followed by immediate 
cooling on ice. 

Next, the following solution was prepared and mixed in the tube. 

5 x RT Buffer 5 µl;
25 mM MgCl2 2.5 µl;
2 mM dNTP mix 5 µl;
0.1 M DTT 2.5 µl;
H2O (added to give a total amount of 24 µl).

The solution was gently agitated, and then it was centrifuged under a mild condition to allow the reaction solution
to fall to the bottom of the tube, followed by being left to stand at 42 °C for 1 minute. 1 µl of SuperScript II RT
(reverse transcriptase, GIBCO BRL) was added to the solution, and it was gently mixed. After that, the reaction was performed
at 42 °C for 50 minutes. Subsequently, the reaction solution was left to stand at 70 °C for 15 minutes to stop the
reaction. Centrifugation was performed under a mild condition to allow the reaction solution to fall to the bottom of the
tube, followed by being left to stand at 37 °C. RNase H (produced by Toyobo) in an amount of 1 µl was added thereto
to perform a reaction at 37 °C for 30 minutes.

Subsequently, in order to remove excessive primers and nucleotides contained in the reaction solution, gel filtration
was performed by using a purification column produced by Boehringer, Quick Spin Columns. At first, the tip of the
column was removed, followed by centrifugation at 1,100 x g for 2 minutes to discard the buffer. The reaction solution
was introduced into the central area of the column, followed by centrifugation at 1,100 x g for 4 minutes to recover the
solution.

(1-b) Poly(dC) tailing

An aliquot (5 µl) was dispensed from the obtained solution, to which the following solution was added.

5 µl 5 x CoCl2 Buffer;
2.5 µl 2 mM dCTP;
H2O (adjusted to give a total amount of 24 µl).

The reaction solution was mixed well, and it was left to stand at 94 °C for 3 minutes. Centrifugation was performed
under a mild condition to allow the reaction solution to fall to the bottom of the tube, followed by being left to stand on
ice. Terminal transferase TdT (produced by Toyobo) was added thereto in an amount of 1 µl, followed by mixing under
a mild condition to perform a reaction at 37 °C for 10 minutes. Subsequently, the reaction solution was left to stand at
65 °C for 10 minutes to stop the reaction.

(1-c) PCR reaction 

An aliquot (2.5 µl) was dispensed from the reaction solution, to which the following solution was added.

2.5 µl 10 x PCR Buffer;
2.5 µl 2 mM dNTP mix;
0.5 µl Gene-specific primer 2;
0.5 µl Abridged Anchor Primer (GIBCO BRL);
0.5 µl Advantage Klentaq Polymerase Mix (Clontech);
H2O (adjusted to give a total amount of 25 µl).
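
Several of the reaction mixtures in this example specify water only as the amount needed to reach a stated total volume. As a minimal sketch, the Python function below computes that amount for the PCR mixture listed above. The function and variable names are hypothetical, and the 2.5 µl template aliquot is assumed to count toward the 25 µl total.

```python
from typing import Mapping


def water_to_add(components_ul: Mapping[str, float], total_ul: float) -> float:
    """Return the volume of water (µl) needed to reach the stated total volume."""
    used = sum(components_ul.values())
    if used > total_ul:
        raise ValueError(f"components ({used} µl) exceed the total ({total_ul} µl)")
    return total_ul - used


# PCR mixture of step (1-c); volumes in microliters.
pcr_mix = {
    "poly(dC)-tailed cDNA aliquot": 2.5,  # assumed to be part of the 25 µl total
    "10 x PCR Buffer": 2.5,
    "2 mM dNTP mix": 2.5,
    "Gene-specific primer 2": 0.5,
    "Abridged Anchor Primer": 0.5,
    "Advantage Klentaq Polymerase Mix": 0.5,
}
print(water_to_add(pcr_mix, 25.0))  # 16.0
```

With the volumes above, 2.5 µl of 2 mM dNTP mix in 25 µl corresponds to a final concentration of 0.2 mM.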

The following oligonucleotides were used as Gene-specific primer 2. That is, an oligonucleotide having a nucleotide 
sequence shown in SEQ ID NO: 18 was used for PcsA1. An oligonucleotide having a nucleotide sequence shown in 
SEQ ID NO: 19 was used for PcsA2. An oligonucleotide having a nucleotide sequence shown in SEQ ID NO: 20 was 
used for PcsA3. 

The solution was introduced into a 0.2 ml tube to perform the PCR reaction under the following condition. 



PAD          94 °C          90 seconds
30 cycles    94 °C          30 seconds
             60 to 68 °C    30 to 60 seconds
             68 °C          180 seconds
Final        68 °C          7 minutes
Hold         4 °C
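
As a rough planning aid, the programmed hold time of the above PCR condition can be summed as in the following Python sketch. The function name is hypothetical, and ramping time between temperatures, which depends on the thermal cycler, is ignored.

```python
def program_duration_seconds(annealing_s: int) -> int:
    """Sum the programmed hold times of the PCR condition, excluding ramping."""
    initial_denaturation = 90           # 94 °C, 90 seconds
    cycles = 30
    denaturation = 30                   # 94 °C, 30 seconds
    extension = 180                     # 68 °C, 180 seconds
    final_extension = 7 * 60            # 68 °C, 7 minutes
    per_cycle = denaturation + annealing_s + extension
    return initial_denaturation + cycles * per_cycle + final_extension


# The annealing step is given as a range of 30 to 60 seconds.
shortest = program_duration_seconds(30)
longest = program_duration_seconds(60)
print(shortest / 60, longest / 60)  # 128.5 143.5 (minutes)
```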





The reaction solution was subjected to agarose gel electrophoresis, and the DNAs corresponding to the bands
having the largest size (about 1.8 kb for PcsA1, about 2 kb for PcsA2, and about 2.2 kb for PcsA3) were extracted from the gel.
GENO-BIND produced by CLONTECH was used for the extraction, and the procedure was carried out in accordance with its
protocol. The DNA thus obtained was subjected to poly(dC) tailing, and it was used as a template to perform the PCR
reaction. The condition and the composition of the reaction solution were the same as those described above.

(2) Cloning 

(2-a) 5'-RACE TA cloning

Starting from the obtained PCR reaction solution, cloning was performed by using TA Cloning Kit produced by 
Invitrogen in accordance with its protocol. 

The following solution was added to an aliquot (1.5 µl) of the PCR reaction solution obtained as described above.

0.5 µl 10 x Ligation Buffer;
1 µl pCRM vector;
0.5 µl T4 DNA Ligase;
1.5 µl dH2O.

The reaction was performed at 14 °C overnight. An aliquot (2 µl) of the reaction solution was added to 25 µl of
Escherichia coli competent cell (JM109) preparation, followed by being left to stand for 30 minutes on ice. After that,
heat shock was applied at 42 °C for 30 seconds. The solution was left to stand on ice for 2 minutes, to
which 450 µl of SOB medium was thereafter added to perform cultivation at 37 °C for 1 hour with shaking at 200 rpm.
The culture was spread over an Amp/Xgal/IPTG plate, followed by incubation at 37 °C overnight. The plasmid was
extracted from the obtained colonies in accordance with the method as described above.

(2-b) Cloning of complete length cDNA 

Sequencing was carried out by using DNA Sequencer 377 produced by ABI in accordance with its protocol.
The sequencing reaction was performed by using M13 primer and synthetic oligomers as primers, based on the use of
Dye Terminator Cycle Sequencing Kit produced by the same company. As a result of the sequencing, as for PcsA3, it
was revealed that another clone also belonging to the group of PcsA3 but having a slightly different sequence (one
amino acid position) had been isolated (see Figs. 3 and 4). A nucleotide sequence of a clone (PcsA3-682) containing
the 3'-side region of PcsA3 and an amino acid sequence deduced from this nucleotide sequence are shown in SEQ
ID NOs: 5 and 6. A nucleotide sequence of a 5'-portion (PcsA3-5') of another clone containing the 5'-side region of
PcsA3 and an amino acid sequence deduced from this nucleotide sequence are shown in SEQ ID NOs: 7 and 8. A
nucleotide sequence of a 3'-portion (PcsA3-3') of the clone and an amino acid sequence deduced from this nucleotide
sequence are shown in SEQ ID NOs: 9 and 10.

As for PcsA1 and PcsA2, primers for the 5'-terminal and the 3'-terminal of a region containing the ORF were synthesized on
the basis of the obtained sequences to perform the PCR reaction. Thus, complete length clones were isolated by
means of TA cloning. The condition and the composition of the reaction solution were the same as those described
above.

Oligonucleotides shown in SEQ ID NO: 21 (5'-terminal) and SEQ ID NO: 22 (3'-terminal) were used as the primers
for PcsA1. Oligonucleotides shown in SEQ ID NO: 23 (5'-terminal) and SEQ ID NO: 24 (3'-terminal) were used as the
primers for PcsA2. Results are shown in SEQ ID NOs: 1 to 4.
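
The sequence listing that follows gives the coding region of SEQ ID NO: 1 as positions 77 to 3001 and the length of SEQ ID NO: 2 as 974 amino acids. The Python sketch below is a simple consistency check of these two figures, assuming that the stated coding region includes the stop codon. The function name is hypothetical.

```python
def protein_length_from_cds(start: int, end: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS given 1-based inclusive positions."""
    n_bases = end - start + 1
    if n_bases <= 0 or n_bases % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = n_bases // 3
    # The stop codon, when included in the range, does not encode an amino acid.
    return codons - 1 if includes_stop else codons


# CDS location of SEQ ID NO: 1 as given in the sequence listing.
print(protein_length_from_cds(77, 3001))  # 974
```

The result agrees with the length stated for SEQ ID NO: 2.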
Annex to the description

SEQUENCE LISTING 

(1) GENERAL INFORMATION:

(i) APPLICANT: NISSHINBO INDUSTRIES, INC.
               HAYASHI, Takahisa
(ii) TITLE OF INVENTION: CELLULOSE SYNTHASE GENE
(iii) NUMBER OF SEQUENCES: 24
(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE:
    (B) STREET:
    (C) CITY:
    (E) COUNTRY:
    (F) ZIP:
(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: JP 9-83133
    (B) FILING DATE: 1-APR-1997
(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME:
    (B) REGISTRATION NUMBER:
(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE:
    (B) TELEFAX:

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 3207 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(vi) ORIGINAL SOURCE:
    (A) ORGANISM: Gossypium hirsutum L.
    (C) INDIVIDUAL ISOLATE: Coker312
(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 77..3001
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GGTTAGCATA TTGTTTOTAG CATTGGGTTT TTTTCTCAAG GAAGAAGAAG GAGAAAGATA 60 
ACTAATCTTT TTGAGA ATG ATG GAA TCT GGG GTT OCT GTT TGC CAC ACT 109 

Met: Met: Glu Ser Gly Val Pro Val Cys His Thr 
15 10 
TGT OCT GAA CAT GTT GGG TTG AAT GTT AAT GGT GAA CCC TTT GTG GCT 157 
Cys Gly Glu His Val Gly Leu Asn Val Asn Gly Glu Pro Phe Val Ala 

15 20 25 

TGC CAT GAA TGT AAT TIC OCT ATT TGT AAG AGT TGT TTT GAG TAT GAT 205 
Cys His Glu Cys Asn Phe Pro lie Cys Lys Ser Cys Phe Glu Tyr Asp 

30 35 40 

CIT AAG GAA GGA CAA AAA GCT TGC TTG OCT TGT GGT ATT COG TAT GAT 253 
Leu Lys Glu Gly Gin Lys Ala Cys Leu Arg Cys Gly lie Pro Tyr Asp 

45 50 55 

GAA AAC CTG TTG GAC GAT GTC GAG AAG GCC ACC GGC GAT CAA TOG ACA 301 
25 Glu Asn Leu Leu Asp Asp Val Glu Lys Ala Thr Gly Asp Gin Ser Thr 
60 65 70 75 

ATG GCT GGA CAT TTG AGC AAG TCT CAG GAT GTT GGA ATT CAT GCA AGA 349 
Met: Ala Ala His Leu Ser Lys Ser Gin Asp Val Gly lie His Ala Arg 
30 80 85 90 

CAT ATC AGC AGT GTG TCT ACA TTG GAT ACT GAA ATG ACT GAA GAC AAT 397 
His lie Ser Ser Val Ser Thr Leu Asp Ser Glu Met Thr Glu Asp Asn 

95 100 105 

35 GGG AAT COG ATT TGG AAG AAC AGG GTG GAA AGT TOG AAA GAA AAG AAG 445 

Gly Asn Pro lie Trp Lys Asn Arg Val Glu Ser Trp Lys Glu Lys Lys 

110 115 120 

AAC AAG AAG AAG AAG OCT GCA ACA ACT AAG GTT GAA AGA GAG GCT GAA 493 
40 Asn Lys Lys Lys Lys Pro Ala Thr Thr Lys Val Glu Arg Glu Ala Glu 

125 130 135 

ATC CCA OCT GAG CAA CAA ATG GAA GAT AAA COG GCA COG GAT GCT TCC 541 
lie Pro Pro Glu Gin Gin Met Glu Asp Lys Pro Ala Pro Asp Ala Ser 
45 140 145 150 155 

CAG CCC CTC TOG ACT ATA ATT CCA ATC COG AAA AGC AGA CIT GCA CCA 589 
Gin Pro Leu Ser Thr lie lie Pro lie Pro Lys Ser Arg Leu Ala Pro 

160 165 170 

50 TAC GGA ACC GTG ATC ATT ATG 0GA TIG ATC ATT CTC GCT CTT TTC TTC 637 
Tyr Arg Thr Val lie lie Met Arg Leu lie lie Leu Gly Leu Phe Phe 
175 180 185

CAT TAT CGA GTA ACA AAC COC GTT GAC ACT OCT TTT GGA CTG TGG CTC 685 
His Tyr Arg Val Thr Asn Pro Val Asp Ser Ala Phe Gly Leu Trp Leu 

190 195 200 

ACT TCA GTC ATA TGT GAA ATC TGG TTT GCT TTT TOC TGG GTG TTG GAT 733 

Thr Ser Val lie Cys Glu lie Trp Phe Ala Phe Ser Trp Val Leu Asp 

205 210 215 

CAG TTC OCT AAG TGG TAT OCT GTT AAC AGG GAA ACA TAC ATT GAC AGA 781 
Gin Phe Pro Lys Trp Tyr Pro Val Asn Arg Glu Thr Tyr lie Asp Arg 
220 225 230 235 

CTG TCT GCA AGA TAT GAA AGA GAA GGT GAA OCT AAT GAA CTT GCT GCA 829 
Leu Ser Ala Arg Tyr Glu Arg Glu Gly Glu Pro Asn Glu Leu Ala Ala 

240 245 250 

GTT GAC TTC TTT CTG ACT ACA CTG GAT OCA TTG AAA GAG OCT OCA TTG 877 
Val Asp Fte Phe Val Ser Thr Val Asp Pro Leu Lys Glu Pro Pro Leu 
2 o 255 260 265 

ATT ACT G0C AAT ACT CTG CTT TCC ATC CTT GCC TTG GAC TAC COG GTA 925 
lie Thr Ala Asn Thr Val Leu Ser lie Leu Ala Leu Asp Tyr Pro Val 
270 275 280 

25 GAT AAG GTC TCT TCT TAT ATA TCT GAT GAT GGT GOG G0C ATG CTG ACA 973 

Asp Lys Val Ser Cys Tyr lie Ser Asp Asp Gly Ala Ala Met Leu Thr 

285 290 295 

TTT GAA TCT CTA GTA GAA ACA GCC GAC TTT GCA AGA AAG TGG CTT CCA 1021 

30 phe Glu Ser Leu Val Glu Thr Ala Asp Phe Ala Arg Lys Trp Val Pro 

300 305 310 315 

TTC TGC AAA AAA TTT TCC ATT GAA OCA OGG GCA OCT GAG TTT TAC TTC 1069 
Phe Cys Lys Lys Phe Ser lie Glu Pro Arg Ala Pro Glu Phe Tyr Phe 
35 320 325 330 

TCA CAG AAG ATT GAT TAC TTG AAA GAT AAA GTG CAG CCC TCT TTT CTA 1117 
Ser Gin Lys lie Asp Tyr Leu Lys Asp Lys Val Gin Pro Ser Phe Val 

335 340 345 

40 AAA GAA OCT AGA GCT ATG AAA AGA GAT TAC GAA GAG TAC AAA ATT CGA 1165 

Lys Glu Axg Arg Ala Met Lys Arg Asp Tyr Glu Glu Tyr Lys lie Arg 

350 355 360 

ATC AAT GCT TTA GTT GCA AAG GCT CAG AAA ACA OCT GAA GAA GGA TGG 1213 
45 lie Asn Ala Leu Val Ala Lys Ala Gin Lys Thr Pro Glu Glu Gly Trp 

365 370 375 

ACA ATG CAA GAT GGA ACT OCT TGG COG GGA AAT AAC COG OCT GAT CAC 1261 
Thr Met Gin Asp Gly Thr Pro Trp Pro Gly Asn Asn Pro Arg Asp His 
380 385 390 395 

OCT GGC ATG ATT CAG GTT TTC CTT GGA TAT AGC GCT GCT CAT GAC ATC 1309 
Pro Gly Met Ile Gln Val Phe Leu Gly Tyr Ser Gly Ala His Asp Ile

400 405 410 

GAA GGA AAT GAA CTT COC OGA CTG GTT TAG GTC TCT AGA GAG AAG AGA 1357 
Glu Gly Asn Glu Leu Pro Arg Leu Val Tyr Val Ser Arg Glu Lys Arg 

415 420 425 

OCT QQC TAG GAA CAC CAC AAA AAG OCT GGT OCT GAA AAT OCT TTC GTT 1405 
Pro Gly Tyr Gin His His Lys Lys Ala Gly Ala Glu Asn Ala Leu Val 

430 435 440 

AGG GTG TCT OCA CTT CTT ACA AAT GCT COC TTC ATC CTC AAT CTT GAT 1453 
Arg Val Ser Ala Val Leu Thr Asn Ala Pro Pte lie Leu Asn Leu Asp 

445 450 455 

TCT GAC CAC TAT CTT AAC AAT AGC AAG GCA GTT AGG GAG GCA ATG TGC 1501 
Cys Asp His Tyr Val Asn Asn Ser Lys Ala Val Arg Glu Ala Met: Cys 
460 465 470 475 

TTC TTG ATG GAC CCA CAA GTC GCT OGA GAT GTC TGC TAT GTG CAG TIT 1549 
Hie Leu Met Asp Pro Gin Val Gly Arg Asp Val Cys Tyr Val Gin Phe 

460 485 490 

OCT CAA AGA TTT GAT GGC ATA GAT AGG ACT GAT OGA TAT GCC AAT CGG 1597 
Pro Gin Arg Phe Asp Gly lie Asp Arg Ser Asp Arg Tyr Ala Asn Arg 
25 495 500 505 

AAC ACA CTT TTC TTT GAT GTT AAC ATG AAA GGT CTT GAT GGA ATC CAA 1645 

Asn Thr Val Phe Phe Asp Val Asn Met Lys Gly Leu Asp Gly lie Gin 
510 515 520 

30 GGG OCT GTT TAT GTG GGA ACA GOT TOT GTT TTC AAT AGG CAA GCA CTT 1693 

Gly Pro Val Tyr Val Gly Thr Gly Cys Val Phe Asn Arg Gin Ala Leu 

525 530 535 

TAT GGC TAT GGT CCA OCT TCA ATG OCA ACT TTT COC AAG TCA TOO TCC 1741 
35 Tyr Gly Tyr Gly Pro Pro Ser Met Pro Ser Pte Pro Lys Ser Ser Ser 

540 545 550 555 

TCA TCT TGC TOG TCT TGC TGC COC GGC AAG AAG GAA OCT AAA GAT CCA 1789 
Ser Ser Cys Ser Cys Cys Cys Pro Gly Lys Lys Glu Pro Lys Asp Pro 
40 560 565 570 

TCA GAG CTT TAT AGG GAT GCA AAA OGG GAA GAA CTT GAT GCT GCC ATC 1837 
Ser Glu Leu Tyr Arg Asp Ala Lys Arg Glu Glu Leu Asp Ala Ala lie 

575 580 585 

45 TTT AAC CTT AGG GAA ATT GAC AAT TAT GAT GAG TAT GAA AGA TCA ATG 1885 

Phe Asn Leu Arg Glu lie Asp Asn Tyr Asp Glu Tyr Glu Arg Ser Met 

590 595 600 

TTC ATC TCT CAA ACA AGC TTT GAG AAA ACT TTT GGC TTA TCT TCA GTC 1933 
Leu lie Ser Gin Thr Ser Phe Glu Lys Thr Phe Gly Leu Ser Ser Val 
605 610 615 
TO ATT GAA TCT ACA CTA ATG GAG AAT GGA GGA GTG GCT GAA TCT GCC 1981
Phe Ile Glu Ser Thr Leu Met Glu Asn Gly Gly Val Ala Glu Ser Ala
620 625 630 635 

AAC OCT TOC ACA CTA ATC AAG GAA GCA ATT CAT CTC ATC GQC TGT GGC 2029 
Asn Pro Ser Thr Leu He Lys Glu Ala He His Val He Gly Cys Gly 

640 645 650 

TAT GAG GAG AAG ACT GCA TGG GGG AAA GAG ATT GGA TOG ATA TAT GGT 2077 
Tyr Glu Glu Lys Thr Ala Trp Gly Lys Glu He Gly Trp He Tyr Gly 

655 660 665 

TCA GTC ACT GAG GAT ATC TTA ACC GGC TTC AAA ATG CAC TOC CGA GGA 2125 

Ser Val Tfir Glu Asp He Leu Thr Gly Phe Lys Met His Cys Arg Gly 

670 675 680 

TGG AGA TOG ATT TAC TGC ATG COC TTA AGG CCA GCA TTC AAA GGA TCT 2173 
Trp Arg Ser He Tyr Cys Met Pro Leu Arg Pro Ala Hie Lys Gly Ser 

685 690 695 

GCA COC ATC AAT CTG TCT GAT COG TTC CAC CAG GOT CTT CGA TGG GCT 2221 
Ala Pro He Asn Leu Ser Asp Arg Leu His Gin Val leu Arg Trp Ala 
700 705 710 715 

CTT GGA TCT GIT GAA ATT TTC CTA AGC AGG CAT TGC OCT CTA TGG TAT 2269 
Leu Gly Ser Val Glu He Phe Leu Ser Arg His Cys Pro Leu Trp Tyr 

720 725 730 

GGC TTT GGA GGT GGT CGT CIT AAA TGG CTT CAA AGA CTA GCA TAT ATA 2317 
Gly Phe Gly Gly Gly Arg Leu Lys Trp Leu Gin Arg Leu Ala Tyr He 
30 735 740 745 

AAC ACC ATT GTC TAT OCT TTC ACA TOC CTT OCA CTC ATT GOC TAT TGT 2365 
Asn Thr He Val Tyr Pro Phe Thr Ser Leu Pro Leu He Ala Tyr Cys 
750 755 760 

35 TCA CTA CCA GCA ATC TGT CTT CTC ACA GGA AAA TTT ATC ATA CCA AGG 2413 

Ser Leu Pro Ala He Cys Leu Leu Thr Gly Lys Phe He He Pro Thr 

765 770 775 

CTC TCA AAC CTG GCA AGT CTT CTC TTT CTT GGC CTT TTC CTT TOC ATT 2461 
Leu Ser Asn Leu Ala Ser Val Leu Phe Leu Gly Leu Phe Leu Ser He 
780 785 790 795 

ATC GTG ACT GCT GOT CTC GAG CTC CGA TGG ACT GCT GTC AGC ATT GAG 2509 
He Val Thr Ala Val Leu Glu Leu Arg Trp Ser Gly Val Ser He Glu 

800 805 810 

GAC TTA TGG OCT AAC GAG CAG TTT TGG GTC ATC GGT GGC CTT TCA GOC 2557 
Asp Leu Trp Arg Asn Glu Gin Phe Trp Val He Gly Gly Val Ser Ala 

815 820 825 

CAT CTC TTT GOC CTC TTC CAA GCT TTC CTT AAG ATG CTT GOG GGC ATT 2605 
His Leu Phe Ala Val Phe Gin Gly Phe Leu Lys Met Leu Ala Gly He 
830 835 840

GAC ACC AAC TTT ACT GTC ACT GCC AAA GCA OCT GAT GAT GCA GAT TTT 2653 

Asp Thr Asn Phe Thr Val Thr Ala Lys Ala Ala Asp Asp Ala Asp Phe 

845 850 855 

OCT GAG CTC TAC ATT GTG AAA TGG ACT ACA CTT CTA ATC OCT CCA ACA 2701 
Gly Glu Leu Tyr lie Val Lys Trp Thr Thr Leu Leu lie Pro Pro Thr 
860 865 870 875 

ACA CTC CTC ATC GTC AAC ATG GTT GGT GTC GTT GOC GGA TTC TCC GAT 2749 
Thr Leu Leu He Val Asn Met: Val Gly Val Val Ala Gly Phe Ser Asp 

880 885 890 

GCC CTC AAC AAA GGG TAC GAA GCT TGG GGA OCA CTC TTT GOC AAA GTG 2797 
15 Ala Leu Asn Lys Gly Tyr Glu Ala Trp Gly Pro Leu Fhe Gly Lys Val 

895 900 905 

TTC TTT TCC TTC TGG GTC ATC CTC CAT CTT TAT CCA TTC CTC AAA GCT 2845 
Phe Phe Ser Phe Trp Val He Leu His Leu Tyr Pro Phe Leu Lys Gly 
20 910 915 920 

CTT ATG GGA CGC CAA AAC AGG ACA OCA ACC ATT GTT GTC CTT TGG TCA 2893 
Leu Met: Gly Arg Gin Asn Arg Thr Pro Thr He Val Val Leu Trp Ser 
925 930 935 

25 GTG TTG TTG GCT TCT GTC TTC TCT CTT GTT TGG GTT CGG ATC AAC GOG 2941 

Val Leu Leu Ala Ser Val Phe Ser Leu Val Trp Val Arg He Asn Pro 
940 945 950 955 

TTT GTC AGC AOC GCC GAT AGC ACC ACC GTG TCA CAG AGC TGC ATT TCC 2989 
Phe Val Ser Thr Ala Asp Ser Thr Thr Val Ser Gin Ser Cys He Ser 

960 965 970 

ATT GAT TGT TGATGATATT ATGTGTTTCT TAGAATTGAA ATCATIGCAA 3038 
He Asp Cys 

CTAAGTGGAC TGAAACATGT CTATTGACTA AGTTTTGAAC AGTTTGTACC CATTTTATTC 3098 
TTAGCAGTCT CTAATTTTOC TAAACAATGC TATGAACTAT ACATATTTCA TTGATATTTA 3158 
CATTAAATGA AACTACATCA GTCTGCAGAA AAAAAAAAAA AAAAAAAAA 3207 
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 974 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Met Glu Ser Gly Val Pro Val Cys His Thr Cys Gly Glu His Val
1               5                    10                  15

Gly Leu Asn Val Asn Gly Glu Pro Phe Val Ala Cys His Glu Cys Asn 
20 25 30

Phe Pro lie Cys Lys Ser Cys Phe Glu Tyr Asp Leu Lys Glu Gly Gin 

35 40 45 

Lys Ala Cys Leu Arg Cys Gly lie Pro Tyr Asp Glu Asn Leu Leu Asp 

50 55 60 

Asp Val Glu Lys Ala Thr Gly Asp Gin Ser Thr Met Ala Ala His Leu 

65 70 75 80 

Ser Lys Ser Gin Asp Val Gly lie His Ala Arg His lie Ser Ser Val 

85 90 95 

Ser Thr Leu Asp Ser Glu Met Thr Glu Asp Asn Gly Asn Pro lie Trp 

100 105 110 

Lys Asn Arg Val Glu Ser Trp Lys Glu Lys Lys Asn Lys Lys Lys Lys 

115 120 125 

Pro Ala Thr Thr Lys Val Glu Arg Glu Ala Glu lie Pro Pro Glu Gin 

130 135 140 

Gin Met Glu Asp Lys Pro Ala Pro Asp Ala Ser Gin Pro Leu Ser Thr 
145 150 155 160 

lie lie Pro lie Pro Lys Ser Arg Leu Ala Pro Tyr Arg Thr Val lie 

165 170 175 

lie Met Arg Leu lie lie Leu Gly Leu Phe Phe His Tyr Arg Val Thr 

180 185 190 

Asn Pro Val Asp Ser Ala Phe Gly Leu Trp Leu Thr Ser Val lie Cys 

195 200 205 

Glu lie Trp Phe Ala Phe Ser Trp Val Leu Asp Gin Phe Pro Lys Trp 

210 215 220 

Tyr Pro Val Asn Arg Glu Thr Tyr lie Asp Arg Leu Ser Ala Arg Tyr 
225 230 235 240 

Glu Arg Glu Gly Glu Pro Asn Glu Leu Ala Ala Val Asp Phe Phe Val 

245 250 255 

Ser Thr Val Asp Pro Leu Lys Glu Pro Pro Leu lie Thr Ala Asn Thr 

260 265 270 

Val Leu Ser lie Leu Ala Leu Asp Tyr Pro Val Asp Lys Val Ser Cys 

275 280 285 

Tyr lie Ser Asp Asp Gly Ala Ala Met Leu Thr Phe Glu Ser Leu Val 

290 295 300 

Glu Thr Ala Asp Phe Ala Arg Lys Trp Val Pro Phe Cys Lys Lys Phe 
305 310 315 320 

Ser lie Glu Pro Arg Ala Pro Glu Phe Tyr phe Ser Gin Lys lie Asp 

325 330 335 

Tyr Leu Lys Asp Lys Val Gin Pro Ser Phe Val Lys Glu Arg Arg Ala 

340 345 350 
Met Lys Arg Asp Tyr Glu Glu Tyr Lys Ile Arg Ile Asn Ala Leu Val
        355                 360                 365

Ala Lys Ala Gln Lys Thr Pro Glu Glu Gly Trp Thr Met Gln Asp Gly
    370                 375                 380

Thr Pro Trp Pro Gly Asn Asn Pro Arg Asp His Pro Gly Met Ile Gln
385                 390                 395                 400

Val Phe Leu Gly Tyr Ser Gly Ala His Asp Ile Glu Gly Asn Glu Leu
                405                 410                 415

Pro Arg Leu Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Tyr Gln His
            420                 425                 430

His Lys Lys Ala Gly Ala Glu Asn Ala Leu Val Arg Val Ser Ala Val
        435                 440                 445

Leu Thr Asn Ala Pro Phe Ile Leu Asn Leu Asp Cys Asp His Tyr Val
    450                 455                 460

Asn Asn Ser Lys Ala Val Arg Glu Ala Met Cys Phe Leu Met Asp Pro
465                 470                 475                 480

Gln Val Gly Arg Asp Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp
                485                 490                 495

Gly Ile Asp Arg Ser Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe
            500                 505                 510

Asp Val Asn Met Lys Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val
        515                 520                 525

Gly Thr Gly Cys Val Phe Asn Arg Gln Ala Leu Tyr Gly Tyr Gly Pro
    530                 535                 540

Pro Ser Met Pro Ser Phe Pro Lys Ser Ser Ser Ser Ser Cys Ser Cys
545                 550                 555                 560

Cys Cys Pro Gly Lys Lys Glu Pro Lys Asp Pro Ser Glu Leu Tyr Arg
                565                 570                 575

Asp Ala Lys Arg Glu Glu Leu Asp Ala Ala Ile Phe Asn Leu Arg Glu
            580                 585                 590

Ile Asp Asn Tyr Asp Glu Tyr Glu Arg Ser Met Leu Ile Ser Gln Thr
        595                 600                 605

Ser Phe Glu Lys Thr Phe Gly Leu Ser Ser Val Phe Ile Glu Ser Thr
    610                 615                 620

Leu Met Glu Asn Gly Gly Val Ala Glu Ser Ala Asn Pro Ser Thr Leu
625                 630                 635                 640

Ile Lys Glu Ala Ile His Val Ile Gly Cys Gly Tyr Glu Glu Lys Thr
                645                 650                 655

Ala Trp Gly Lys Glu Ile Gly Trp Ile Tyr Gly Ser Val Thr Glu Asp
            660                 665                 670

Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly Trp Arg Ser Ile Tyr
        675                 680                 685

Cys Met Pro Leu Arg Pro Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu
690 695 700

Ser Asp Arg Leu His Gln Val Leu Arg Trp Ala Leu Gly Ser Val Glu
705 710 715 720

Ile Phe Leu Ser Arg His Cys Pro Leu Trp Tyr Gly Phe Gly Gly Gly
725 730 735

Arg Leu Lys Trp Leu Gln Arg Leu Ala Tyr Ile Asn Thr Ile Val Tyr
740 745 750

Pro Phe Thr Ser Leu Pro Leu Ile Ala Tyr Cys Ser Leu Pro Ala Ile
755 760 765

Cys Leu Leu Thr Gly Lys Phe Ile Ile Pro Thr Leu Ser Asn Leu Ala
770 775 780

Ser Val Leu Phe Leu Gly Leu Phe Leu Ser Ile Ile Val Thr Ala Val
785 790 795 800

Leu Glu Leu Arg Trp Ser Gly Val Ser Ile Glu Asp Leu Trp Arg Asn
805 810 815

Glu Gln Phe Trp Val Ile Gly Gly Val Ser Ala His Leu Phe Ala Val
820 825 830

Phe Gln Gly Phe Leu Lys Met Leu Ala Gly Ile Asp Thr Asn Phe Thr
835 840 845

Val Thr Ala Lys Ala Ala Asp Asp Ala Asp Phe Gly Glu Leu Tyr Ile
850 855 860

Val Lys Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Leu Leu Ile Val
865 870 875 880

Asn Met Val Gly Val Val Ala Gly Phe Ser Asp Ala Leu Asn Lys Gly
885 890 895

Tyr Glu Ala Trp Gly Pro Leu Phe Gly Lys Val Phe Phe Ser Phe Trp
900 905 910

Val Ile Leu His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly Arg Gln
915 920 925

Asn Arg Thr Pro Thr Ile Val Val Leu Trp Ser Val Leu Leu Ala Ser
930 935 940

Val Phe Ser Leu Val Trp Val Arg Ile Asn Pro Phe Val Ser Thr Ala
945 950 955 960

Asp Ser Thr Thr Val Ser Gln Ser Cys Ile Ser Ile Asp Cys
965 970

(2) INFORMATION FOR SEQ ID NO: 3: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3311 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Gossypium hirsutum L.
(C) INDIVIDUAL ISOLATE: Coker312
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 23..3142
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CTTTC33TTCT TTTGGTTTTG CC ATG GCT TCA ACC ACC ATG GCC GCT GGC TTT 52
Met Ala Ser Thr Thr Met Ala Ala Gly Phe
1 5 10

GGT TCA CTT GCT GTT GAC GAG AAT CGG GGA TCA TCG ACA CAT CAA TCA 100
Gly Ser Leu Ala Val Asp Glu Asn Arg Gly Ser Ser Thr His Gln Ser
15 20 25

TCA ACG AAA ATA TGC AGG GTG TGT GGG GAT AAG ATC GGG CAA AAG GAA 148
Ser Thr Lys Ile Cys Arg Val Cys Gly Asp Lys Ile Gly Gln Lys Glu
30 35 40

AAC GGA CAA CCG TTC GTG GCT TGT CAT GTC TGT GCT TTC CCG GTT TGC 196
Asn Gly Gln Pro Phe Val Ala Cys His Val Cys Ala Phe Pro Val Cys
45 50 55

CGT CCT TGT TAT GAA TAT GAA AGG AGT GAA GGA AAC CAG TGC TGT CCT 244
Arg Pro Cys Tyr Glu Tyr Glu Arg Ser Glu Gly Asn Gln Cys Cys Pro
60 65 70

CAG TGC AAT ACT CGC TAT AAG CGT CAC AAA GGT AGT CCA AGA ATT TCA 292
Gln Cys Asn Thr Arg Tyr Lys Arg His Lys Gly Ser Pro Arg Ile Ser
75 80 85 90

GGA GAT GAA GAA GAT GAT TCA GAT CAA GAT GAT TTT GAT GAT GAA TTT 340
Gly Asp Glu Glu Asp Asp Ser Asp Gln Asp Asp Phe Asp Asp Glu Phe
95 100 105

CAG ATT AAG AAC CGC AAG GAT GAC TCC CAT CCA CAA CAT GAA AAT GAG 388
Gln Ile Lys Asn Arg Lys Asp Asp Ser His Pro Gln His Glu Asn Glu
110 115 120

GAA TAT AAT AAT AAT AAT CAT CAA TGG CAT CCC AAT GGT CAA GCT TTC 436
Glu Tyr Asn Asn Asn Asn His Gln Trp His Pro Asn Gly Gln Ala Phe
125 130 135

TCA GTT GCC GGA AGC ACG GCG GGG AAG GAT TTG GAA GGG GAT AAA GAG 484
Ser Val Ala Gly Ser Thr Ala Gly Lys Asp Leu Glu Gly Asp Lys Glu
140 145 150

ATT TAC GGA AGC GAA GAA TGG AAA GAA AGA GTT GAG AAA TGG AAA GTC 532
Ile Tyr Gly Ser Glu Glu Trp Lys Glu Arg Val Glu Lys Trp Lys Val
155 160 165 170

AGG CAA GAA AAA AGA GGT TTG GTA AGC AAC GAT AAT GGC GGA AAT GAT 580
Arg Gln Glu Lys Arg Gly Leu Val Ser Asn Asp Asn Gly Gly Asn Asp
175 180 185

CCT CCT GAA GAA GAT GAT TAT CTC TTG GCT GAA GCT CGC CAG CCT CTA 628
Pro Pro Glu Glu Asp Asp Tyr Leu Leu Ala Glu Ala Arg Gln Pro Leu
190 195 200

TGG CGA AAA GTG CCA ATT TCG TCA AGT CTG ATA AGC CCT TAC CGG ATA 676
Trp Arg Lys Val Pro Ile Ser Ser Ser Leu Ile Ser Pro Tyr Arg Ile
205 210 215

GTC ATC GTC CTC CGA TTC TTC ATC CTC GCA TTT TTC CTC CGG TTC CGT 724
Val Ile Val Leu Arg Phe Phe Ile Leu Ala Phe Phe Leu Arg Phe Arg
220 225 230

ATT CTA ACA CCC GCC TAC GAC GCT TAC CCG TTA TGG CTA ATC TCT GTC 772
Ile Leu Thr Pro Ala Tyr Asp Ala Tyr Pro Leu Trp Leu Ile Ser Val
235 240 245 250

ATC TGC GAA GTT TGG TTC GCC TTC TCC TGG ATT CTC GAT CAG TTC CCT 820
Ile Cys Glu Val Trp Phe Ala Phe Ser Trp Ile Leu Asp Gln Phe Pro
255 260 265

AAA TGG TTC CCT ATT ACT CGC GAA ACT TAC CTC GAT CGC CTC TCC TTG 868
Lys Trp Phe Pro Ile Thr Arg Glu Thr Tyr Leu Asp Arg Leu Ser Leu
270 275 280

AGG TTC GAA CGT GAA GGA GAG CCC AAT CAA CTT GGC CCC GTC GAC GTC 916
Arg Phe Glu Arg Glu Gly Glu Pro Asn Gln Leu Gly Pro Val Asp Val
285 290 295

TTC GTC AGT ACC GTT GAC CTT CTC AAG GAA CCC CCC ATC ATA ACC GCC 964
Phe Val Ser Thr Val Asp Leu Leu Lys Glu Pro Pro Ile Ile Thr Ala
300 305 310

AAC GCG GTT CTA TCG ATC TTG GCC GTC GAT TAC CCG GTC GAG AAA GTG 1012
Asn Ala Val Leu Ser Ile Leu Ala Val Asp Tyr Pro Val Glu Lys Val
315 320 325 330

TGT TGT TAT GTG TCG GAC GAT GGT GCT TCC ATG CTT CTT TTC GAT TCG 1060
Cys Cys Tyr Val Ser Asp Asp Gly Ala Ser Met Leu Leu Phe Asp Ser
335 340 345

TTG TCT GAA ACG GCT GAG TTC GCG AGG AGA TGG GTT CCG TTT TGT AAG 1108
Leu Ser Glu Thr Ala Glu Phe Ala Arg Arg Trp Val Pro Phe Cys Lys
350 355 360

AAG CAT AAT GTT GAG CCC AGG GCG CCG GAG TTT TAT TTC AAT GAG AAG 1156
Lys His Asn Val Glu Pro Arg Ala Pro Glu Phe Tyr Phe Asn Glu Lys
365 370 375

ATT GAT TAT TTG AAG GAC AAG GTC CAT CCT AGC TTT GTT AAA GAA CGG 1204
Ile Asp Tyr Leu Lys Asp Lys Val His Pro Ser Phe Val Lys Glu Arg
380 385 390

AGA GCC ATG AAA AGG GAA TAT GAA GAA TTT AAA GTA AGG ATC AAT GCA 1252
Arg Ala Met Lys Arg Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala
395 400 405 410

TTA GTA GCA AAA GCT CAG AAG AAA CCA GAA GAA GGA TGG GTC ATG CAA 1300
Leu Val Ala Lys Ala Gln Lys Lys Pro Glu Glu Gly Trp Val Met Gln
415 420 425

GAT GGC ACC CCA TGG CCC GGA AAT AAC ACT CGT GAT CAT CCT GGA ATG 1348
Asp Gly Thr Pro Trp Pro Gly Asn Asn Thr Arg Asp His Pro Gly Met
430 435 440

ATT CAG GTC TAT CTA GGA AGT GCC GGT GCA CTC GAT GTG GAT GGC AAA 1396
Ile Gln Val Tyr Leu Gly Ser Ala Gly Ala Leu Asp Val Asp Gly Lys
445 450 455

GAG CTG CCT CGA CTT GTC TAT GTT TCT CGT GAG AAA CGA CCT GGT TAT 1444
Glu Leu Pro Arg Leu Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Tyr
460 465 470

CAG CAC CAT AAG AAA GCC GGT GCT GAG AAT GCT CTG GTT CGA GTT TCT 1492
Gln His His Lys Lys Ala Gly Ala Glu Asn Ala Leu Val Arg Val Ser
475 480 485 490

GCA GTC CTT ACT AAT GCA CCC TTC ATA TTG AAT CTG GAT TGT GAT CAT 1540
Ala Val Leu Thr Asn Ala Pro Phe Ile Leu Asn Leu Asp Cys Asp His
495 500 505

TAC ATC AAC AAT AGC AAG GCC ATG AGG GAA GCG ATG TGC TTT TTA ATG 1588
Tyr Ile Asn Asn Ser Lys Ala Met Arg Glu Ala Met Cys Phe Leu Met
510 515 520

GAT CCT CAG TTT GGA AAG AAG CTT TGT TAT GTT CAA TTT CCA CAG AGA 1636
Asp Pro Gln Phe Gly Lys Lys Leu Cys Tyr Val Gln Phe Pro Gln Arg
525 530 535

TTT GAT GGT ATT GAT CGT CAT GAT CGA TAT GCT AAT CGA AAT GTT GTC 1684
Phe Asp Gly Ile Asp Arg His Asp Arg Tyr Ala Asn Arg Asn Val Val
540 545 550

TTC TTT GAT ATC AAC ATG TTG GGA TTA GAT GGA CTT CAA GGC CCT GTA 1732
Phe Phe Asp Ile Asn Met Leu Gly Leu Asp Gly Leu Gln Gly Pro Val
555 560 565 570

TAT GTA GGC ACA GGG TGT GTT TTC AAC AGG CAG GCA TTG TAT GGC TAC 1780
Tyr Val Gly Thr Gly Cys Val Phe Asn Arg Gln Ala Leu Tyr Gly Tyr
575 580 585

GAT CCA CCA GTC TCT GAG AAA CGA CCA AAG ATG ACA TGT GAT TGC TGG 1828
Asp Pro Pro Val Ser Glu Lys Arg Pro Lys Met Thr Cys Asp Cys Trp
590 595 600

CCT TCT TGG TGT TGC TGT TGT TGC GGA GGT TCT AGG AAG AAA TCA AAG 1876
Pro Ser Trp Cys Cys Cys Cys Cys Gly Gly Ser Arg Lys Lys Ser Lys
605 610 615

AAG AAA GGT GAA AAG AAG GGC TTA CTC GGA GGT CTT TTA TAC GGA AAA 1924
Lys Lys Gly Glu Lys Lys Gly Leu Leu Gly Gly Leu Leu Tyr Gly Lys
620 625 630

AAG AAG AAG ATG ATG GGC AAA AAC TAT GTG AAA AAA GGG TCT GCA CCA 1972
Lys Lys Lys Met Met Gly Lys Asn Tyr Val Lys Lys Gly Ser Ala Pro
635 640 645 650

GTC TTT GAT CTC GAA GAA ATC GAA GAA GGG CTT GAA GGA TAC GAA GAA 2020
Val Phe Asp Leu Glu Glu Ile Glu Glu Gly Leu Glu Gly Tyr Glu Glu
655 660 665

TTG GAG AAA TCG ACA TTA ATG TCG CAG AAG AAT TTC GAG AAA CGA TTC 2068
Leu Glu Lys Ser Thr Leu Met Ser Gln Lys Asn Phe Glu Lys Arg Phe
670 675 680

GGA CAA TCA CCG GTT TTC ATT GCC TCA ACT TTG ATG GAA AAT GGT GGC 2116
Gly Gln Ser Pro Val Phe Ile Ala Ser Thr Leu Met Glu Asn Gly Gly
685 690 695

CTT CCT GAA GGA ACT AAT TCC ACA TCA CTG ATT AAA GAG GCC ATT CAC 2164
Leu Pro Glu Gly Thr Asn Ser Thr Ser Leu Ile Lys Glu Ala Ile His
700 705 710

GTA ATT AGC TGT GGT TAT GAA GAA AAA ACT GAG TGG GGC AAA GAG ATC 2212
Val Ile Ser Cys Gly Tyr Glu Glu Lys Thr Glu Trp Gly Lys Glu Ile
715 720 725 730

GGA TGG ATT TAT GGG TCG GTG ACG GAA GAT ATA TTA ACA GGT TTC AAG 2260
Gly Trp Ile Tyr Gly Ser Val Thr Glu Asp Ile Leu Thr Gly Phe Lys
735 740 745

ATG CAT TGT AGA GGG TGG AAA TCG GTT TAT TGT GTA CCG AAA AGA CCG 2308
Met His Cys Arg Gly Trp Lys Ser Val Tyr Cys Val Pro Lys Arg Pro
750 755 760

GCA TTC AAA GGG TCC GCT CCA ATC AAT CTC TCG GAT CGG TTG CAC CAA 2356
Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp Arg Leu His Gln
765 770 775

GTT TTG AGA TGG GCA CTT GGT TCT GTA GAA ATT TTC CTT AGT CGT CAC 2404
Val Leu Arg Trp Ala Leu Gly Ser Val Glu Ile Phe Leu Ser Arg His
780 785 790

TGT CCA CTT TGG TAT GGT TAT GGT GGA AAA CTG AAA TGG CTC GAG AGG 2452
Cys Pro Leu Trp Tyr Gly Tyr Gly Gly Lys Leu Lys Trp Leu Glu Arg
795 800 805 810

CTT GCT TAT ATC AAC ACC ATT GTT TAC CCT TTC ACC TCG ATC CCT TTA 2500
Leu Ala Tyr Ile Asn Thr Ile Val Tyr Pro Phe Thr Ser Ile Pro Leu
815 820 825

CTC GCC TAT TGT ACT ATT CCA GCT GTT TGT CTT CTC ACC GGC AAA TTC 2548
Leu Ala Tyr Cys Thr Ile Pro Ala Val Cys Leu Leu Thr Gly Lys Phe
830 835 840

ATC ATT CCA ACT CTA AGC AAC CTT ACA AGT GTG TGG TTC TTG GCA CTT 2596
Ile Ile Pro Thr Leu Ser Asn Leu Thr Ser Val Trp Phe Leu Ala Leu
845 850 855

TTC CTC TCC ATC ATT GCA ACT GGA GTG CTT GAA CTT CGA TGG AGC GGG 2644
Phe Leu Ser Ile Ile Ala Thr Gly Val Leu Glu Leu Arg Trp Ser Gly
860 865 870

GTT AGC ATC CAA GAC TGG TGG CGC AAT GAA CAA TTC TGG GTG ATC GGA 2692
Val Ser Ile Gln Asp Trp Trp Arg Asn Glu Gln Phe Trp Val Ile Gly
875 880 885 890

GGT GTC TCC GCC CAT CTT TTT GCT GTC TTC CAG GGC CTC CTC AAA GTC 2740
Gly Val Ser Ala His Leu Phe Ala Val Phe Gln Gly Leu Leu Lys Val
895 900 905

CTA GCT GGA GTA GAC ACC AAC TTC ACC GTA ACA GCA AAA GCA GCA GAC 2788
Leu Ala Gly Val Asp Thr Asn Phe Thr Val Thr Ala Lys Ala Ala Asp
910 915 920

GAT ACA GAA TTC GGT GAA CTT TAT CTC TTC AAA TGG ACA ACT CTC TTA 2836
Asp Thr Glu Phe Gly Glu Leu Tyr Leu Phe Lys Trp Thr Thr Leu Leu
925 930 935

ATC CCT CCC ACA ACT CTG ATA ATA CTG AAC ATG GTC GGA GTC GTG GCC 2884
Ile Pro Pro Thr Thr Leu Ile Ile Leu Asn Met Val Gly Val Val Ala
940 945 950

GGA GTT TCA GAC GCA ATC AAC AAC GGC TAT GGT TCA TGG GGT CCA TTG 2932
Gly Val Ser Asp Ala Ile Asn Asn Gly Tyr Gly Ser Trp Gly Pro Leu
955 960 965 970

TTC GGC AAA CTG TTC TTC GCA TTC TGG GTC ATT CTT CAT CTT TAC CCA 2980
Phe Gly Lys Leu Phe Phe Ala Phe Trp Val Ile Leu His Leu Tyr Pro
975 980 985

TTC CTC AAA GGT TTG ATG GGG AGA CAA AAC AGG ACG CCC ACC ATT GTT 3028
Phe Leu Lys Gly Leu Met Gly Arg Gln Asn Arg Thr Pro Thr Ile Val
990 995 1000

GTG CTT TGG TCC ATA CTT TTG GCA TCG ATT TTC TCA CTG GTT TGG GTA 3076
Val Leu Trp Ser Ile Leu Leu Ala Ser Ile Phe Ser Leu Val Trp Val
1005 1010 1015

CGG ATC GAT CCC TTC TTG CCC AAA CAA ACA GGT CCA GTT CTT AAA CAA 3124
Arg Ile Asp Pro Phe Leu Pro Lys Gln Thr Gly Pro Val Leu Lys Gln
1020 1025 1030

TGT GGC GTG GAG TGC TAAATOGTCT TTTACAAACC TTTCTTATTA TTTTATTTTC 3179
Cys Gly Val Glu Cys
1035

CLTlTl-iUUC ACTACTCTTG ATTTGCTCTG ATTCTAAAAG GGATTTATCT TGTTTGTAAA 3239 
AAGTCTOCTA TGATTTTGTT GCTTCAATTT AATTTCTATA TOGTAAAAAA ATATTTCTTT 3299 
AAATTAACTA TA 
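
The feature table for SEQ ID NO: 3 places the coding sequence at positions 23..3142 of the 3311-base cDNA. As a rough consistency check (a sketch only, not part of the original listing), the span should divide evenly into codons, and translating individual rows with the standard genetic code should reproduce the residues printed beneath them. The helper names `cds_codon_count` and `translate` are hypothetical, and the first-row codon string assumes that the OCR-damaged codons read ACC and GCT, as the printed translation (Thr, Ala) implies.

```python
# Standard genetic code (NCBI translation table 1), built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cds_codon_count(start: int, end: int) -> int:
    """Codons in a 1-based, inclusive CDS span; raises if the span is not a multiple of 3."""
    length = end - start + 1
    if length % 3:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


def translate(dna: str) -> str:
    """Translate an in-frame DNA string to one-letter residues ('*' marks a stop codon)."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# CDS 23..3142 as stated in the feature table of SEQ ID NO: 3.
codons = cds_codon_count(23, 3142)

# First row of the CDS (assumed reading of the damaged codons, see lead-in).
first_row = translate("ATG GCT TCA ACC ACC ATG GCC GCT GGC TTT")

# Last row of the CDS, including the TAA stop codon.
last_row = translate("TGT GGC GTG GAG TGC TAA")

print(codons, first_row, last_row)
```

Under these assumptions the span holds 1040 codons, that is 1039 residues plus the stop codon, which agrees with the 1039 amino acids stated for SEQ ID NO: 4.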



(2) INFORMATION FOR SEQ ID NO: 4: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1039 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
Met Ala Ser Thr Thr Met Ala Ala Gly Phe Gly Ser Leu Ala Val Asp
1 5 10 15

Glu Asn Arg Gly Ser Ser Thr His Gln Ser Ser Thr Lys Ile Cys Arg
20 25 30

Val Cys Gly Asp Lys Ile Gly Gln Lys Glu Asn Gly Gln Pro Phe Val
35 40 45

Ala Cys His Val Cys Ala Phe Pro Val Cys Arg Pro Cys Tyr Glu Tyr
50 55 60

Glu Arg Ser Glu Gly Asn Gln Cys Cys Pro Gln Cys Asn Thr Arg Tyr
65 70 75 80

Lys Arg His Lys Gly Ser Pro Arg Ile Ser Gly Asp Glu Glu Asp Asp
85 90 95

Ser Asp Gln Asp Asp Phe Asp Asp Glu Phe Gln Ile Lys Asn Arg Lys
100 105 110

Asp Asp Ser His Pro Gln His Glu Asn Glu Glu Tyr Asn Asn Asn Asn
115 120 125

His Gln Trp His Pro Asn Gly Gln Ala Phe Ser Val Ala Gly Ser Thr
130 135 140

Ala Gly Lys Asp Leu Glu Gly Asp Lys Glu Ile Tyr Gly Ser Glu Glu
145 150 155 160

Trp Lys Glu Arg Val Glu Lys Trp Lys Val Arg Gln Glu Lys Arg Gly
165 170 175

Leu Val Ser Asn Asp Asn Gly Gly Asn Asp Pro Pro Glu Glu Asp Asp
180 185 190

Tyr Leu Leu Ala Glu Ala Arg Gln Pro Leu Trp Arg Lys Val Pro Ile
195 200 205

Ser Ser Ser Leu Ile Ser Pro Tyr Arg Ile Val Ile Val Leu Arg Phe
210 215 220

Phe Ile Leu Ala Phe Phe Leu Arg Phe Arg Ile Leu Thr Pro Ala Tyr
225 230 235 240

Asp Ala Tyr Pro Leu Trp Leu Ile Ser Val Ile Cys Glu Val Trp Phe
245 250 255

Ala Phe Ser Trp Ile Leu Asp Gln Phe Pro Lys Trp Phe Pro Ile Thr
260 265 270

Arg Glu Thr Tyr Leu Asp Arg Leu Ser Leu Arg Phe Glu Arg Glu Gly
275 280 285

Glu Pro Asn Gln Leu Gly Pro Val Asp Val Phe Val Ser Thr Val Asp
290 295 300

Leu Leu Lys Glu Pro Pro Ile Ile Thr Ala Asn Ala Val Leu Ser Ile
305 310 315 320

Leu Ala Val Asp Tyr Pro Val Glu Lys Val Cys Cys Tyr Val Ser Asp
325 330 335

Asp Gly Ala Ser Met Leu Leu Phe Asp Ser Leu Ser Glu Thr Ala Glu
340 345 350

Phe Ala Arg Arg Trp Val Pro Phe Cys Lys Lys His Asn Val Glu Pro
355 360 365

Arg Ala Pro Glu Phe Tyr Phe Asn Glu Lys Ile Asp Tyr Leu Lys Asp
370 375 380

Lys Val His Pro Ser Phe Val Lys Glu Arg Arg Ala Met Lys Arg Glu
385 390 395 400

Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala Gln
405 410 415

Lys Lys Pro Glu Glu Gly Trp Val Met Gln Asp Gly Thr Pro Trp Pro
420 425 430

Gly Asn Asn Thr Arg Asp His Pro Gly Met Ile Gln Val Tyr Leu Gly
435 440 445

Ser Ala Gly Ala Leu Asp Val Asp Gly Lys Glu Leu Pro Arg Leu Val
450 455 460

Tyr Val Ser Arg Glu Lys Arg Pro Gly Tyr Gln His His Lys Lys Ala
465 470 475 480

Gly Ala Glu Asn Ala Leu Val Arg Val Ser Ala Val Leu Thr Asn Ala
485 490 495

Pro Phe Ile Leu Asn Leu Asp Cys Asp His Tyr Ile Asn Asn Ser Lys
500 505 510

Ala Met Arg Glu Ala Met Cys Phe Leu Met Asp Pro Gln Phe Gly Lys
515 520 525

Lys Leu Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp Arg
530 535 540

His Asp Arg Tyr Ala Asn Arg Asn Val Val Phe Phe Asp Ile Asn Met
545 550 555 560

Leu Gly Leu Asp Gly Leu Gln Gly Pro Val Tyr Val Gly Thr Gly Cys
565 570 575

Val Phe Asn Arg Gln Ala Leu Tyr Gly Tyr Asp Pro Pro Val Ser Glu
580 585 590

Lys Arg Pro Lys Met Thr Cys Asp Cys Trp Pro Ser Trp Cys Cys Cys
595 600 605

Cys Cys Gly Gly Ser Arg Lys Lys Ser Lys Lys Lys Gly Glu Lys Lys
610 615 620

Gly Leu Leu Gly Gly Leu Leu Tyr Gly Lys Lys Lys Lys Met Met Gly
625 630 635 640

Lys Asn Tyr Val Lys Lys Gly Ser Ala Pro Val Phe Asp Leu Glu Glu
645 650 655

Ile Glu Glu Gly Leu Glu Gly Tyr Glu Glu Leu Glu Lys Ser Thr Leu
660 665 670

Met Ser Gln Lys Asn Phe Glu Lys Arg Phe Gly Gln Ser Pro Val Phe
675 680 685

Ile Ala Ser Thr Leu Met Glu Asn Gly Gly Leu Pro Glu Gly Thr Asn
690 695 700

Ser Thr Ser Leu Ile Lys Glu Ala Ile His Val Ile Ser Cys Gly Tyr
705 710 715 720

Glu Glu Lys Thr Glu Trp Gly Lys Glu Ile Gly Trp Ile Tyr Gly Ser
725 730 735

Val Thr Glu Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly Trp
740 745 750

Lys Ser Val Tyr Cys Val Pro Lys Arg Pro Ala Phe Lys Gly Ser Ala
755 760 765

Pro Ile Asn Leu Ser Asp Arg Leu His Gln Val Leu Arg Trp Ala Leu
770 775 780

Gly Ser Val Glu Ile Phe Leu Ser Arg His Cys Pro Leu Trp Tyr Gly
785 790 795 800

Tyr Gly Gly Lys Leu Lys Trp Leu Glu Arg Leu Ala Tyr Ile Asn Thr
805 810 815

Ile Val Tyr Pro Phe Thr Ser Ile Pro Leu Leu Ala Tyr Cys Thr Ile
820 825 830

Pro Ala Val Cys Leu Leu Thr Gly Lys Phe Ile Ile Pro Thr Leu Ser
835 840 845

Asn Leu Thr Ser Val Trp Phe Leu Ala Leu Phe Leu Ser Ile Ile Ala
850 855 860

Thr Gly Val Leu Glu Leu Arg Trp Ser Gly Val Ser Ile Gln Asp Trp
865 870 875 880

Trp Arg Asn Glu Gln Phe Trp Val Ile Gly Gly Val Ser Ala His Leu
885 890 895

Phe Ala Val Phe Gln Gly Leu Leu Lys Val Leu Ala Gly Val Asp Thr
900 905 910

Asn Phe Thr Val Thr Ala Lys Ala Ala Asp Asp Thr Glu Phe Gly Glu
915 920 925

Leu Tyr Leu Phe Lys Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Leu
930 935 940

Ile Ile Leu Asn Met Val Gly Val Val Ala Gly Val Ser Asp Ala Ile
945 950 955 960

Asn Asn Gly Tyr Gly Ser Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe
965 970 975

Ala Phe Trp Val Ile Leu His Leu Tyr Pro Phe Leu Lys Gly Leu Met
980 985 990

Gly Arg Gln Asn Arg Thr Pro Thr Ile Val Val Leu Trp Ser Ile Leu
995 1000 1005

Leu Ala Ser Ile Phe Ser Leu Val Trp Val Arg Ile Asp Pro Phe Leu
1010 1015 1020

Pro Lys Gln Thr Gly Pro Val Leu Lys Gln Cys Gly Val Glu Cys
1025 1030 1035
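
The three-letter rows of SEQ ID NO: 4 are easier to compare against other sequences once they are collapsed to one-letter codes. The following is a minimal sketch of that conversion, assuming the standard IUPAC residue codes; the function name `to_one_letter` is hypothetical. Position markers (the numbers printed under each row) are skipped, and any token that is not a standard code raises a `KeyError`, which is a convenient way to flag residue names damaged in transcription.

```python
# Standard IUPAC three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(row: str) -> str:
    """Convert a row of three-letter residue codes to a one-letter string.

    Numeric tokens (position markers) are ignored; unknown codes raise KeyError.
    """
    letters = []
    for token in row.split():
        if token.isdigit():
            continue  # position marker, not a residue
        letters.append(THREE_TO_ONE[token.capitalize()])
    return "".join(letters)


# Final row of SEQ ID NO: 4 together with its position markers.
tail = to_one_letter(
    "Pro Lys Gln Thr Gly Pro Val Leu Lys Gln Cys Gly Val Glu Cys 1025 1030 1035"
)
print(tail)
```

Raising on unknown tokens is a deliberate choice in this sketch: silently skipping them would shift every later residue number.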



(2) INFORMATION FOR SEQ ID NO: 5: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2033 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Gossypium hirsutum L.

(C) INDIVIDUAL ISOLATE: Coker312
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1857

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
CCG ACA TTC GTG AAG GAG CGT CGA GCT ATG AAG AGA GAA TAT GAA GAA 48
Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu
1 5 10 15

TTC AAG GTT AGG ATA AAT GCA CTT GTA GCC AAA GCC CAA AAG GTT CCT 96
Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala Gln Lys Val Pro
20 25 30

CCA GAA GGG TGG ATC ATG CAA GAT GGG ACA CCA TGG CCA GGA AAC AAT 144
Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn
35 40 45

ACT AAA GAT CAC CCT GGT ATG ATT CAA GTA TTT CTC GGT CAA AGT GGA 192
Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu Gly Gln Ser Gly
50 55 60

GGC CAT GAT ACC GAA GGA AAT GAG CTT CCT CGT CTC GTC TAT GTA TCT 240
Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu Val Tyr Val Ser
65 70 75 80

CGA GAG AAA AGG CCT GGT TTC TTG CAT CAC AAG AAA GCT GGT GCC ATG 288
Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys Ala Gly Ala Met
85 90 95

AAC GCC CTT GTT CGG GTC TCG GGG GTG CTC ACA AAT GCT CCT TTT ATG 336
Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn Ala Pro Phe Met
100 105 110

TTG AAC TTG GAT TGT GAC CAT TAT TTA AAT AAC AGC AAG GCT GTA AGA 384
Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser Lys Ala Val Arg
115 120 125

GAG GCT ATG TGT TTC TTG ATG GAC CCT CAA ATT GGA AGA AAG GTT TGC 432
Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly Arg Lys Val Cys
130 135 140

TAT GTC CAA TTC CCT CAA CGT TTC GAT GGT ATT GAT AGA CAT GAT CGA 480
Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp Arg His Asp Arg
145 150 155 160

TAT GCC AAT CGG AAC ACA GTT TTC TTT GAT ATT AAC ATG AAA GGT CTA 528
Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn Met Lys Gly Leu
165 170 175

GAT GGT ATA CAA GGC CCT GTA TAT GTC GGC ACG GGG TGT GTT TTC AGA 576
Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly Cys Val Phe Arg
180 185 190

AGG CAA GCT CTT TAT GGT TAT GAA CCT CCA AAG GGA CCT AAG CGC CCG 624
Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly Pro Lys Arg Pro
195 200 205

AAA ATG GTA ACC TGT GGT TGC TGC CCT TGT TTT GGA CGC CGC AGA AAG 672
Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly Arg Arg Arg Lys
210 215 220

GAC AAA AAG CAC TCT AAG GAT GGT GGA AAT GCA AAT GGT CTA AGC CTA 720
Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn Gly Leu Ser Leu
225 230 235 240

GAA GCA GCC AAA GAT GAC AAG GAG TTA TTG ATG TCC CAC ATG AAC TTT 768
Glu Ala Ala Lys Asp Asp Lys Glu Leu Leu Met Ser His Met Asn Phe
245 250 255

GAA AAG AAA TTT GGA CAA TCA GCC ATT TTT GTA ACT TCA ACA CTG ATG 816
Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr Ser Thr Leu Met
260 265 270

GAA CAA GGT GGT GTC CCT CCT TCT TCA AGC CCC GCA GCT TTG CTC AAA 864
Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala Ala Leu Leu Lys
275 280 285

GAA GCC ATT CAT GTA ATT AGT TGT GGT TAT GAA GAC AAA ACA GAA TGG 912
Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp Lys Thr Glu Trp
290 295 300

GGA AGC GAG CTT GGC TGG ATT TAC GGC TCG ATT ACA GAA GAT ATC TTA 960
Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr Glu Asp Ile Leu
305 310 315 320

ACA GGA TTC AAG ATG CAT TGC CGT GGA TGG AGA TCA ATA TAC TGC ATG 1008
Thr Gly Phe Lys Met His Cys Arg Gly Trp Arg Ser Ile Tyr Cys Met
325 330 335

CCA AAG TTG CCT GCA TTC AAG GGT TCA GCT CCC ATC AAT CTA TCG GAT 1056
Pro Lys Leu Pro Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp
340 345 350

CGT CTA AAC CAA GTC CTT CGA TGG GCA CTC GGT TCT GTT GAA ATT TTC 1104
Arg Leu Asn Gln Val Leu Arg Trp Ala Leu Gly Ser Val Glu Ile Phe
355 360 365

TTT AGT CAT CAT TGC CCA GCA TGG TAT GGT TTC AAG GGA GGA AAG CTA 1152
Phe Ser His His Cys Pro Ala Trp Tyr Gly Phe Lys Gly Gly Lys Leu
370 375 380

AAA TGG CTT GAA CGA TTC GCA TAT GTC AAC ACA ACC ATC TAC CCC TTC 1200
Lys Trp Leu Glu Arg Phe Ala Tyr Val Asn Thr Thr Ile Tyr Pro Phe
385 390 395 400

ACA TCT TTA CCA CTT CTC GCC TAT TGT ACC CTA CCG GCA ATC TGT TTA 1248
Thr Ser Leu Pro Leu Leu Ala Tyr Cys Thr Leu Pro Ala Ile Cys Leu
405 410 415

CTT ACC GAT AAA TTT ATC ATG CCA CCG ATA AGC ACC TTT GCA AGT CTA 1296
Leu Thr Asp Lys Phe Ile Met Pro Pro Ile Ser Thr Phe Ala Ser Leu
420 425 430

TTC TTC ATT GCC TTG TTT CTT TCA ATC TTT GCA ACT GGT ATT CTC GAG 1344
Phe Phe Ile Ala Leu Phe Leu Ser Ile Phe Ala Thr Gly Ile Leu Glu
435 440 445

CTA AGG TGG AGT GGA GTA AGC ATT GAA GAA TGG TGG AGG AAT GAG CAA 1392
Leu Arg Trp Ser Gly Val Ser Ile Glu Glu Trp Trp Arg Asn Glu Gln




EP 0 875 575 A2 




450 455 460 

TTT TGG GTC ATC GGT GGC ATT TCG GCA CAT TTG TTC GCT GTT ATC CAA 1440
Phe Trp Val Ile Gly Gly Ile Ser Ala His Leu Phe Ala Val Ile Gln
465 470 475 480

GGC TTG TTG AAA GTT CTA GCT GGT ATT GAC ACT AAT TTC ACT GTC ACA 1488
Gly Leu Leu Lys Val Leu Ala Gly Ile Asp Thr Asn Phe Thr Val Thr
485 490 495

TCC AAG GCA ACT GAT GAC GAG GAG TTC GGG GAA TTG TAT ACT TTC AAA 1536
Ser Lys Ala Thr Asp Asp Glu Glu Phe Gly Glu Leu Tyr Thr Phe Lys
500 505 510

TGG ACA ACC CTT CTA ATT CCT CCT ACT ACC GTC TTA ATC ATC AAT TTA 1584
Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Val Leu Ile Ile Asn Leu
515 520 525

GTC GGT GTC GTT GCA GGC ATC TCG GAT GCC ATA AAC AAT GGA TAC CAA 1632
Val Gly Val Val Ala Gly Ile Ser Asp Ala Ile Asn Asn Gly Tyr Gln
530 535 540

TCA TGG GGA CCT CTT TTT GGG AAG CTC TTC TTC TCT TTC TGG GTG ATT 1680
Ser Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe Ser Phe Trp Val Ile
545 550 555 560

GTC CAT CTC TAT CCA TTC CTC AAA GGT TTA ATG GGG AGA CAA AAC CGG 1728
Val His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly Arg Gln Asn Arg
565 570 575

ACA CCA ACC ATT GTT GTT ATA TGG TCA GTG CTA TTG GCT TCA ATC TTC 1776
Thr Pro Thr Ile Val Val Ile Trp Ser Val Leu Leu Ala Ser Ile Phe
580 585 590

TCC TTG CTT TGG GTC CGA ATT GAT CCA TTT GTG ATG AAA ACC AAA GGA 1824
Ser Leu Leu Trp Val Arg Ile Asp Pro Phe Val Met Lys Thr Lys Gly
595 600 605

CCA GAC ACT ACA ATG TGT GGC ATT AAC TGT TGAAAAAAAA TCATCTTGOG 1874
Pro Asp Thr Thr Met Cys Gly Ile Asn Cys
610 615
TGGTTCTTTT AGATTATGGT ATGTGATGTA TGAACAAACA AGAATGGAGA TOCACAAGAC 1934 
AGAATAAAAT TAGAGTGAAA GITTTGTGTA GTTATATATT CATTCTACCA ACTATAAGTT 1994 
TTGTCATTCA ATTGAAAATA GCTCAACTTT GTGATCAAA 2033 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 618 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(v) FRAGMENT TYPE: C-terminal fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg Glu Tyr Glu Glu
1 5 10 15

Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala Gln Lys Val Pro
20 25 30

Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp Pro Gly Asn Asn
35 40 45

Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu Gly Gln Ser Gly
50 55 60

Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu Val Tyr Val Ser
65 70 75 80

Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys Ala Gly Ala Met
85 90 95

Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn Ala Pro Phe Met
100 105 110

Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser Lys Ala Val Arg
115 120 125

Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly Arg Lys Val Cys
130 135 140

Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp Arg His Asp Arg
145 150 155 160

Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn Met Lys Gly Leu
165 170 175

Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly Cys Val Phe Arg
180 185 190

Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly Pro Lys Arg Pro
195 200 205

Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly Arg Arg Arg Lys
210 215 220

Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn Gly Leu Ser Leu
225 230 235 240

Glu Ala Ala Lys Asp Asp Lys Glu Leu Leu Met Ser His Met Asn Phe
245 250 255

Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr Ser Thr Leu Met
260 265 270

Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala Ala Leu Leu Lys
275 280 285

Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp Lys Thr Glu Trp
290 295 300

Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr Glu Asp Ile Leu
305 310 315 320

Thr Gly Phe Lys Met His Cys Arg Gly Trp Arg Ser Ile Tyr Cys Met
325 330 335

Pro Lys Leu Pro Ala Phe Lys Gly Ser Ala Pro Ile Asn Leu Ser Asp
340 345 350

Arg Leu Asn Gln Val Leu Arg Trp Ala Leu Gly Ser Val Glu Ile Phe
355 360 365

Phe Ser His His Cys Pro Ala Trp Tyr Gly Phe Lys Gly Gly Lys Leu
370 375 380

Lys Trp Leu Glu Arg Phe Ala Tyr Val Asn Thr Thr Ile Tyr Pro Phe
385 390 395 400

Thr Ser Leu Pro Leu Leu Ala Tyr Cys Thr Leu Pro Ala Ile Cys Leu
405 410 415

Leu Thr Asp Lys Phe Ile Met Pro Pro Ile Ser Thr Phe Ala Ser Leu
420 425 430

Phe Phe Ile Ala Leu Phe Leu Ser Ile Phe Ala Thr Gly Ile Leu Glu
435 440 445

Leu Arg Trp Ser Gly Val Ser Ile Glu Glu Trp Trp Arg Asn Glu Gln
450 455 460

Phe Trp Val Ile Gly Gly Ile Ser Ala His Leu Phe Ala Val Ile Gln
465 470 475 480

Gly Leu Leu Lys Val Leu Ala Gly Ile Asp Thr Asn Phe Thr Val Thr
485 490 495

Ser Lys Ala Thr Asp Asp Glu Glu Phe Gly Glu Leu Tyr Thr Phe Lys
500 505 510

Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Val Leu Ile Ile Asn Leu
515 520 525

Val Gly Val Val Ala Gly Ile Ser Asp Ala Ile Asn Asn Gly Tyr Gln
530 535 540

Ser Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe Ser Phe Trp Val Ile
545 550 555 560

Val His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly Arg Gln Asn Arg
565 570 575

Thr Pro Thr Ile Val Val Ile Trp Ser Val Leu Leu Ala Ser Ile Phe
580 585 590

Ser Leu Leu Trp Val Arg Ile Asp Pro Phe Val Met Lys Thr Lys Gly
595 600 605

Pro Asp Thr Thr Met Cys Gly Ile Asn Cys
610 615

(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1086 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Gossypium hirsutum L.

(C) INDIVIDUAL ISOLATE: Coker312
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 24..1086

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGGAOGAGCT TTCATATCCT OCA ATG GAA GCC AGC GCC GGA CTC GTT GCG 50
Met Glu Ala Ser Ala Gly Leu Val Ala
1 5

GGC TCT CAC AAC CGC AAT GAA CTT GTT GTC ATT CAT GGC CAT GAA GAG 98
Gly Ser His Asn Arg Asn Glu Leu Val Val Ile His Gly His Glu Glu
10 15 20 25

CCT AAA CCT CTG AAG AAC TTG GAT GGT CAA GTT TGT GAG ATT TGT GGT 146
Pro Lys Pro Leu Lys Asn Leu Asp Gly Gln Val Cys Glu Ile Cys Gly
30 35 40

GAT GAA ATT GGG TTG ACG GTC GAT GGA GAT CTT TTC GTG GCC TGC AAC 194
Asp Glu Ile Gly Leu Thr Val Asp Gly Asp Leu Phe Val Ala Cys Asn
45 50 55

GAG TGT GGT TTT CCA GTT TGT AGG CCT TGT TAT GAG TAT GAA AGG AGA 242
Glu Cys Gly Phe Pro Val Cys Arg Pro Cys Tyr Glu Tyr Glu Arg Arg
60 65 70

GAA GGG AGT CAA CAA TGT CCT CAA TGC AAA ACT AGA TAC AAG CGT CTC 290
Glu Gly Ser Gln Gln Cys Pro Gln Cys Lys Thr Arg Tyr Lys Arg Leu
75 80 85

AAG GGG AGT CCG AGG GTG GAG GGA GAT GAA GAT GAA GAG GAT GTG GAT 338
Lys Gly Ser Pro Arg Val Glu Gly Asp Glu Asp Glu Glu Asp Val Asp
90 95 100 105

GAT ATC GAA CAT GAA TTC AAC ATT GAT GAT GAA CAA AAC AAG TAT AGA 386
Asp Ile Glu His Glu Phe Asn Ile Asp Asp Glu Gln Asn Lys Tyr Arg
110 115 120

AAT ATC GCT GAA TCG ATG CTT CAT GGA AAG ATG AGC TAC GGG AGA GGC 434
Asn Ile Ala Glu Ser Met Leu His Gly Lys Met Ser Tyr Gly Arg Gly
125 130 135

CCT GAA GAC GAT GAA GGT TTG CAA ATC CCA CCC GGT TTA GCT GGT GTT 482
Pro Glu Asp Asp Glu Gly Leu Gln Ile Pro Pro Gly Leu Ala Gly Val
140 145 150

CGA OCT CGG CCG GTG AGC GGG GAG TTC CCA ATA GGA AGC TCT CTT GCT 530
Arg Ser Arg Pro Val Ser Gly Glu Phe Pro Ile Gly Ser Ser Leu Ala
155 160 165

TAT GGG GAA CAC ATG TCA AAT AAA CGA GTT CAT CCA TAT CCT ATG TCT 578
Tyr Gly Glu His Met Ser Asn Lys Arg Val His Pro Tyr Pro Met Ser
170 175 180 185

GAA CCT GGA AGT GCA AGA TGG GAT GAA AAG AAA GAG GGA GGA TGG AGA 626
Glu Pro Gly Ser Ala Arg Trp Asp Glu Lys Lys Glu Gly Gly Trp Arg
190 195 200

GAA AGG ATG GAT GAT TGG AAA ATG CAG CAA GGG AAT TTG GGT CCT GAA 674
Glu Arg Met Asp Asp Trp Lys Met Gln Gln Gly Asn Leu Gly Pro Glu
205 210 215

CCT GAT GAT GCC TAT GAT GCT GAC ATG GCT ATG CTT GAT GAA GCT AGG 722
Pro Asp Asp Ala Tyr Asp Ala Asp Met Ala Met Leu Asp Glu Ala Arg
220 225 230

CAG CCA TTG TCA AGG AAA GTG CCA ATT GCA TCG AGC AAA ATC AAT CCT 770
Gln Pro Leu Ser Arg Lys Val Pro Ile Ala Ser Ser Lys Ile Asn Pro
235 240 245

TAT CGT ATG GTG ATT GTG GCT CGT CTA GTT ATC CTT GCT TTC TTT CTT 818
Tyr Arg Met Val Ile Val Ala Arg Leu Val Ile Leu Ala Phe Phe Leu
250 255 260 265

CGC TAT CGG ATT TTG AAC CCG GTA CAT GAT GCA ATT GGG CTT TGG CTA 866
Arg Tyr Arg Ile Leu Asn Pro Val His Asp Ala Ile Gly Leu Trp Leu
270 275 280

ACT TCT GTG ATC TGT GAA ATC TGG TTT GCC TTT TCA TGG ATC CTT GAT 914
Thr Ser Val Ile Cys Glu Ile Trp Phe Ala Phe Ser Trp Ile Leu Asp
285 290 295

CAG TTC CCT AAA TGG TTC CCT ATT GAC CGC GAG ACG TAT CTC GAT CGC 962
Gln Phe Pro Lys Trp Phe Pro Ile Asp Arg Glu Thr Tyr Leu Asp Arg
300 305 310

CTT TCC CTC AGG TAT GAG AGG GAA GGT GAG CCC AAC ATG CTT GCT TCT 1010
Leu Ser Leu Arg Tyr Glu Arg Glu Gly Glu Pro Asn Met Leu Ala Ser
315 320 325

GTT GAT ATT TTT GTC AGT ACA GTG GAT CCA TTG AAG GGA CCT CCT CTA 1058
Val Asp Ile Phe Val Ser Thr Val Asp Pro Leu Lys Gly Pro Pro Leu
330 335 340 345

GTA ACA GCG AAT ACA GTT CTA TCG ATC T 1086
Val Thr Ala Asn Thr Val Leu Ser Ile
350



(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 354 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(v) FRAGMENT TYPE: N-terminal fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
Met Glu Ala Ser Ala Gly Leu Val Ala Gly Ser His Asn Arg Asn Glu
1 5 10 15

Leu Val Val Ile His Gly His Glu Glu Pro Lys Pro Leu Lys Asn Leu
20 25 30

Asp Gly Gln Val Cys Glu Ile Cys Gly Asp Glu Ile Gly Leu Thr Val
35 40 45

Asp Gly Asp Leu Phe Val Ala Cys Asn Glu Cys Gly Phe Pro Val Cys
50 55 60

Arg Pro Cys Tyr Glu Tyr Glu Arg Arg Glu Gly Ser Gln Gln Cys Pro
65 70 75 80

Gln Cys Lys Thr Arg Tyr Lys Arg Leu Lys Gly Ser Pro Arg Val Glu
85 90 95

Gly Asp Glu Asp Glu Glu Asp Val Asp Asp Ile Glu His Glu Phe Asn
100 105 110

Ile Asp Asp Glu Gln Asn Lys Tyr Arg Asn Ile Ala Glu Ser Met Leu
115 120 125

His Gly Lys Met Ser Tyr Gly Arg Gly Pro Glu Asp Asp Glu Gly Leu
130 135 140

Gln Ile Pro Pro Gly Leu Ala Gly Val Arg Ser Arg Pro Val Ser Gly
145 150 155 160

Glu Phe Pro Ile Gly Ser Ser Leu Ala Tyr Gly Glu His Met Ser Asn
165 170 175

Lys Arg Val His Pro Tyr Pro Met Ser Glu Pro Gly Ser Ala Arg Trp
180 185 190

Asp Glu Lys Lys Glu Gly Gly Trp Arg Glu Arg Met Asp Asp Trp Lys
195 200 205

Met Gln Gln Gly Asn Leu Gly Pro Glu Pro Asp Asp Ala Tyr Asp Ala
210 215 220

Asp Met Ala Met Leu Asp Glu Ala Arg Gln Pro Leu Ser Arg Lys Val
225 230 235 240

Pro Ile Ala Ser Ser Lys Ile Asn Pro Tyr Arg Met Val Ile Val Ala
245 250 255

Arg Leu Val Ile Leu Ala Phe Phe Leu Arg Tyr Arg Ile Leu Asn Pro
260 265 270

Val His Asp Ala Ile Gly Leu Trp Leu Thr Ser Val Ile Cys Glu Ile
275 280 285

Trp Phe Ala Phe Ser Trp Ile Leu Asp Gln Phe Pro Lys Trp Phe Pro
290 295 300

Ile Asp Arg Glu Thr Tyr Leu Asp Arg Leu Ser Leu Arg Tyr Glu Arg
305 310 315 320

Glu Gly Glu Pro Asn Met Leu Ala Ser Val Asp Ile Phe Val Ser Thr
325 330 335

Val Asp Pro Leu Lys Gly Pro Pro Leu Val Thr Ala Asn Thr Val Leu
340 345 350

Ser Ile



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1000 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Gossypium hirsutum L.

(C) INDIVIDUAL ISOLATE: Coker312
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1000

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAC AAA GTC CGG CCG ACA TTC GTG AAG GAG CGT CGA GCT ATG AAG AGA 48
Asp Lys Val Arg Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg
1 5 10 15

GAA TAT GAA GAA TTC AAG GTT AGG ATA AAT GCA CTT GTA GCC AAA GCC 96
Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala
20 25 30

CAA AAG GTT CCT CCA GAA GGG TGG ATC ATG CAA GAT GGG ACA CCA TGG 144
Gln Lys Val Pro Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp
35 40 45

CCA GGA AAC AAT ACT AAA GAT CAC CCT GGT ATG ATT CAA GTA TTT CTC 192
Pro Gly Asn Asn Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu
50 55 60

GGT CAA AGT GGA GGC CAT GAT ACC GAA GGA AAT GAG CTT CCT CGT CTC 240
Gly Gln Ser Gly Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu
65 70 75 80

GTC TAT GTA TCT CGA GAG AAA AGG CCA GGT TTC TTG CAT CAC AAG AAA 288
Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys
85 90 95

GCT GGT GCC ATG AAC GCC CTT GTT CGT GTC TCG GGG GTG CTT ACA AAT 336
Ala Gly Ala Met Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn
100 105 110

GCT CCT TTT ATG TTG AAC TTG GAT TGT GAC CAC TAT TTA AAT AAC AGC 384
Ala Pro Phe Met Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser
115 120 125

AAG GCT GTA AGA GAG GCT ATG TGT TTC TTG ATG GAC CCT CAA ATT GGA 432
Lys Ala Val Arg Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly
130 135 140

AGA AAG GTT TGC TAT GTC CAA TTC CCT CAA CGT TTC GAT GGT ATT GAT 480
Arg Lys Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp
145 150 155 160

AGA CAT GAT CGA TAT GCC AAT CGG AAC ACA GTT TTC TTT GAT ATT AAC 528
Arg His Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn
165 170 175

ATG AAA GGT CTA GAT GGT ATA CAA GGC CCT GTA TAT GTC GGC ACG GGG 576
Met Lys Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly
180 185 190

TGT GTT TTC AGA AGG CAA GCT CTT TAT GGT TAT GAA CCT CCA AAG GGA 624
Cys Val Phe Arg Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly
195 200 205

CCT AAG CGC CCG AAA ATG GTA ACC TGT GGT TGC TGC CCT TGC TTT GGA 672
Pro Lys Arg Pro Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly
210 215 220

CGC CGC AGA AAG GAC AAA AAG CAC TCT AAG GAT GGT GGA AAT GCA AAT 720
Arg Arg Arg Lys Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn
225 230 235 240

GGT CTA AGC CTA GAA GCA GCC GAA GAT GAC AAG GAG TTA TTG ATG TCC 768
Gly Leu Ser Leu Glu Ala Ala Glu Asp Asp Lys Glu Leu Leu Met Ser
245 250 255

CAC ATG AAC TTT GAA AAG AAA TTT GGA CAA TCA GCC ATT TTT GTA ACT 816
His Met Asn Phe Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr
260 265 270

TCA ACA CTG ATG GAA CAA GGT GGT GTC CCT CCT TCT TCA AGC CCT GCA 864
Ser Thr Leu Met Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala
275 280 285

GCT TTG CTC AAA GAA GCC ATT CAT GTA ATT AGT TGT GGT TAT GAA GAC 912
Ala Leu Leu Lys Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp
290 295 300

AAA ACC GAA TGG GGA AGC GAG CTT GGC TGG ATT TAC GGC TCG ATT ACA 960
Lys Thr Glu Trp Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr
305 310 315 320

GAA GAT ATC TTA ACA GGT TTC AAG ATG CAT TGC CGT GGA T 1000
Glu Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly
325 330
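
The pairing of each codon row with its amino acid row in SEQ ID NO: 9 can be checked mechanically. The following is a minimal sketch in Python, assuming the standard genetic code and assuming that the first row of the listing reads as given in `FIRST_ROW`; the names `translate`, `CODON_TABLE` and `FIRST_ROW` are illustrative and are not part of the specification.

```python
# Standard genetic code; codons are enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue codes, matching the style of the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> list[str]:
    """Translate the complete codons of `dna`, reading from its first base."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)]


# Assumed reading of nucleotides 1 to 48 of SEQ ID NO: 9.
FIRST_ROW = "GAC AAA GTC CGG CCG ACA TTC GTG AAG GAG CGT CGA GCT ATG AAG AGA"
print(" ".join(translate(FIRST_ROW)))
# Asp Lys Val Arg Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg
```

The same function can be applied row by row; a trailing incomplete codon, such as the single final T of the 1000-base sequence, is ignored.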



(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal fragment

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Asp Lys Val Arg Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg
1 5 10 15

Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala
20 25 30

Gln Lys Val Pro Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp
35 40 45

Pro Gly Asn Asn Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu
50 55 60

Gly Gln Ser Gly Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu
65 70 75 80

Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys
85 90 95

Ala Gly Ala Met Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn
100 105 110

Ala Pro Phe Met Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser
115 120 125

Lys Ala Val Arg Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly
130 135 140

Arg Lys Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp
145 150 155 160

Arg His Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn
165 170 175

Met Lys Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly
180 185 190

Cys Val Phe Arg Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly
195 200 205

Pro Lys Arg Pro Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly
210 215 220

Arg Arg Arg Lys Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn
225 230 235 240

Gly Leu Ser Leu Glu Ala Ala Glu Asp Asp Lys Glu Leu Leu Met Ser
245 250 255

His Met Asn Phe Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr
260 265 270

Ser Thr Leu Met Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala
275 280 285

Ala Leu Leu Lys Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp
290 295 300

Lys Thr Glu Trp Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr
305 310 315 320

Glu Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly
325 330

(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 622 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(v) FRAGMENT TYPE: C-terminal fragment
(ix) FEATURE:

(A) NAME/KEY:

(B) LOCATION:

(D) OTHER INFORMATION: Xaa indicates Glu or Lys
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Asp Lys Val Arg Pro Thr Phe Val Lys Glu Arg Arg Ala Met Lys Arg
1 5 10 15

Glu Tyr Glu Glu Phe Lys Val Arg Ile Asn Ala Leu Val Ala Lys Ala
20 25 30

Gln Lys Val Pro Pro Glu Gly Trp Ile Met Gln Asp Gly Thr Pro Trp
35 40 45

Pro Gly Asn Asn Thr Lys Asp His Pro Gly Met Ile Gln Val Phe Leu
50 55 60

Gly Gln Ser Gly Gly His Asp Thr Glu Gly Asn Glu Leu Pro Arg Leu
65 70 75 80

Val Tyr Val Ser Arg Glu Lys Arg Pro Gly Phe Leu His His Lys Lys
85 90 95

Ala Gly Ala Met Asn Ala Leu Val Arg Val Ser Gly Val Leu Thr Asn
100 105 110

Ala Pro Phe Met Leu Asn Leu Asp Cys Asp His Tyr Leu Asn Asn Ser
115 120 125

Lys Ala Val Arg Glu Ala Met Cys Phe Leu Met Asp Pro Gln Ile Gly
130 135 140

Arg Lys Val Cys Tyr Val Gln Phe Pro Gln Arg Phe Asp Gly Ile Asp
145 150 155 160

Arg His Asp Arg Tyr Ala Asn Arg Asn Thr Val Phe Phe Asp Ile Asn
165 170 175

Met Lys Gly Leu Asp Gly Ile Gln Gly Pro Val Tyr Val Gly Thr Gly
180 185 190

Cys Val Phe Arg Arg Gln Ala Leu Tyr Gly Tyr Glu Pro Pro Lys Gly
195 200 205

Pro Lys Arg Pro Lys Met Val Thr Cys Gly Cys Cys Pro Cys Phe Gly
210 215 220

Arg Arg Arg Lys Asp Lys Lys His Ser Lys Asp Gly Gly Asn Ala Asn
225 230 235 240

Gly Leu Ser Leu Glu Ala Ala Xaa Asp Asp Lys Glu Leu Leu Met Ser
245 250 255

His Met Asn Phe Glu Lys Lys Phe Gly Gln Ser Ala Ile Phe Val Thr
260 265 270

Ser Thr Leu Met Glu Gln Gly Gly Val Pro Pro Ser Ser Ser Pro Ala
275 280 285

Ala Leu Leu Lys Glu Ala Ile His Val Ile Ser Cys Gly Tyr Glu Asp
290 295 300

Lys Thr Glu Trp Gly Ser Glu Leu Gly Trp Ile Tyr Gly Ser Ile Thr
305 310 315 320

Glu Asp Ile Leu Thr Gly Phe Lys Met His Cys Arg Gly Trp Arg Ser
325 330 335

Ile Tyr Cys Met Pro Lys Leu Pro Ala Phe Lys Gly Ser Ala Pro Ile
340 345 350

Asn Leu Ser Asp Arg Leu Asn Gln Val Leu Arg Trp Ala Leu Gly Ser
355 360 365

Val Glu Ile Phe Phe Ser His His Cys Pro Ala Trp Tyr Gly Phe Lys
370 375 380

Gly Gly Lys Leu Lys Trp Leu Glu Arg Phe Ala Tyr Val Asn Thr Thr
385 390 395 400

Ile Tyr Pro Phe Thr Ser Leu Pro Leu Leu Ala Tyr Cys Thr Leu Pro
405 410 415

Ala Ile Cys Leu Leu Thr Asp Lys Phe Ile Met Pro Pro Ile Ser Thr
420 425 430

Phe Ala Ser Leu Phe Phe Ile Ala Leu Phe Leu Ser Ile Phe Ala Thr
435 440 445

Gly Ile Leu Glu Leu Arg Trp Ser Gly Val Ser Ile Glu Glu Trp Trp
450 455 460

Arg Asn Glu Gln Phe Trp Val Ile Gly Gly Ile Ser Ala His Leu Phe
465 470 475 480

Ala Val Ile Gln Gly Leu Leu Lys Val Leu Ala Gly Ile Asp Thr Asn
485 490 495

Phe Thr Val Thr Ser Lys Ala Thr Asp Asp Glu Glu Phe Gly Glu Leu
500 505 510

Tyr Thr Phe Lys Trp Thr Thr Leu Leu Ile Pro Pro Thr Thr Val Leu
515 520 525

Ile Ile Asn Leu Val Gly Val Val Ala Gly Ile Ser Asp Ala Ile Asn
530 535 540

Asn Gly Tyr Gln Ser Trp Gly Pro Leu Phe Gly Lys Leu Phe Phe Ser
545 550 555 560

Phe Trp Val Ile Val His Leu Tyr Pro Phe Leu Lys Gly Leu Met Gly
565 570 575

Arg Gln Asn Arg Thr Pro Thr Ile Val Val Ile Trp Ser Val Leu Leu
580 585 590

Ala Ser Ile Phe Ser Leu Leu Trp Val Arg Ile Asp Pro Phe Val Met
595 600 605

Lys Thr Lys Gly Pro Asp Thr Thr Met Cys Gly Ile Asn Cys
610 615 620

(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: internal fragment 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
Gln Xaa Xaa Xaa Xaa Xaa Xaa Arg Trp
1 5

(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GAGAGAGAGA GAGAGAGAGA ACTAGTCTCG AGTTTTTTTT TTTTTTTTTT 50



(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic DNA"
(ix) FEATURE:

(A) NAME/KEY:

(B) LOCATION: 1..4

(D) OTHER INFORMATION: single strand
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

AATTCGGCAC GAG
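
The feature entry for SEQ ID NO: 14 states that positions 1 to 4 are single-stranded, and FIG. 2 draws a shorter lower strand beneath the 13-base upper strand. The sketch below, in Python, checks that the two strands as drawn in FIG. 2 are consistent with that statement; the names `complement`, `UPPER` and `LOWER` are illustrative assumptions and do not appear in the specification.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Base-by-base complement, kept in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


UPPER = "AATTCGGCACGAG"  # SEQ ID NO: 14, written 5' -> 3'
LOWER = "GCCGTGCTC"      # lower strand of FIG. 2, written 3' -> 5'

# Bases of the upper strand left unpaired at its 5' end.
overhang = len(UPPER) - len(LOWER)
paired = UPPER[overhang:]

# The lower strand must be the complement of the paired part of the upper strand.
assert complement(paired) == LOWER
print(overhang, UPPER[:overhang], paired)
# 4 AATT CGGCACGAG
```

Under this reading the unpaired bases are AATT at positions 1 to 4, in agreement with the LOCATION entry, and the remaining nine bases are fully paired.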

(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GACTGAAGAT AAGCCAAAAG 

(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GGAATGATGA ATTTQOOQG 

(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TGCAGQCAAC TTTGGCATGC 

(2) INFORMATION FOR SEQ ID NO: 18: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 18: 
AGCAACACGA GCAAGATGAG GAGGATGACT 

(2) INFORMATION FOR SEQ ID NO: 19: 
(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
OCX3GATCCTT CAADOCTICT TCGATTTC 

(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
OOGGATOCAC GGCAATGCAT CTPGSAAACC 

(2) INFORMATION FOR SEQ ID NO: 21: 
(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
QCTTAGCATA TTCTTTCTAG CATTGGG 

(2) INFORMATION FOR SEQ ID NO: 22: 
(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
ATCAATGAAA TATCTATAGT TCATAGC 

(2) INFORMATION FOR SEQ ID NO: 23: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CTl ' lOgnCl' TTTGGTTTTG OCATGGC 27

(2) INFORMATION FOR SEQ ID NO: 24: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic DNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
AGACTTTTTA CAAACAAGAT AAATCOC 27 



Claims 

1. A DNA coding for any one of the following proteins (A) to (C):

(A) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID 
NO: 2 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino 
acids relevant to SEQ ID NO: 2; 

(B) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID 
NO: 4 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino 
acids relevant to SEQ ID NO: 4; and 

(C) a protein having a cellulose synthase activity and comprising an amino acid sequence shown in SEQ ID 
NO: 8 or an amino acid sequence involving deletion, substitution, insertion, or addition of one or several amino 
acids relevant to SEQ ID NO: 8, and an amino acid sequence shown in SEQ ID NO: 11 or an amino acid 
sequence involving deletion, substitution, insertion, or addition of one or several amino acids relevant to SEQ 
ID NO: 11. 

2. A recombinant vector comprising all or a part of the DNA as defined in claim 1.

3. A transformed cell transformed with the DNA as defined in claim 1.

4. A method for controlling cellulose synthesis in a cell, comprising the steps of introducing the DNA as defined in
claim 1 into the cell, and expressing RNA having a nucleotide sequence homologous to the DNA as defined in
claim 1 or a nucleotide sequence complementary to the DNA as defined in claim 1.

[Figure: schematic of PcsA3 with the fragments PcsA3-5', PcsA3-3' and PcsA3-682]

FIG. 1



SEQ ID NO: 14

5' AATTCGGCACGAG 3'
3'     GCCGTGCTC 5'

FIG. 2

10 20 30 40 50 60

PcsA3-682 CCGACATTCGTOAAGGAGCGTCGAGCTATGAAGAGAGAATATGAAGAATTCAAGGTTAGG 
(SEQ ID NO: 5)

PcsA3-3' CCGACATTCGTGAAGGAGCGTCGAGCTATGAAGAGAGAATATGAAGAATTCAAGGTTAGG
(SEQ ID NO: 9) 20 30 40 50 60 70

70 80 90 100 110 120

PcsA3-682 ATAAATGCACTTGTAGCCAAAGCCCAAAAGGTTCCTCCAGAAGGGTGGATCATGCAAGAT



PcsA3-3' ATAAATGCACTTGTAGCCAAAGCCCAAAAGGTTCCTCCAGAAGGGTGGATCATGCAAGAT

80 90 100 110 120 130 

130 140 150 160 170 180 

PcsA3-682 GGGACACCATGGCCAGGAAACAATACTAAAGATCACCCTGGTATGATTCAAGTATTTCTC 



PcsA3-3' GGGACACCATGGCCAGGAAACAATACTAAAGATCACCCTGGTATGATTCAAGTATTTCTC 

140 150 160 170 180 190 

190 200 210 220 230 240 

PcsA3-682 GGTCAAAGTGGAGGCCATGATACCGAAGGAAATGAGCTTCCTCGTCTCGTCTATGTATCT 



PcsA3-3' GGTCAAAGTGGAGGCCATGATACCGAAGGAAATGAGCTTCCTCGTCTCGTCTATGTATCT

200 210 220 230 240 250 

250 260 270 280 290 300 

PcsA3-682 CGAGAGAAAAGGCCTGGTTTCTTGCATCACAAGAAAGCTGGTGCCATGAACGCCCTTGTT

::::::::::::::*:::::::::::::::::::::::::::::::::::::::::::::
PcsA3-3' CGAGAGAAAAGGCCAGGTTTCTTGCATCACAAGAAAGCTGGTGCCATGAACGCCCTTGTT
260 270 280 290 300 310

310 320 330 340 350 360

PcsA3-682 CGGGTCTCGGGGGTGCTCACAAATGCTCCTTTTATGTTGAACTTGGATTGTGACCATTAT

PcsA3-3' CGTGTCTCGGGGGTGCTTACAAATGCTCCTTTTATGTTGAACTTGGATTGTGACCACTAT

320 330 340 350 360 370

370 380 390 400 410 420 

PcsA3-682 TTAAATAACAGCAAGGCTGTAAGAGAGGCTATGTGTTTCTTGATGGACCCTCAAATTGGA 



PcsA3-3' TTAAATAACAGCAAGGCTGTAAGAGAGGCTATGTGTTTCTTGATGGACCCTCAAATTGGA 

380 390 400 410 420 430 

430 440 450 460 470 480 

PcsA3-682 AGAAAGGTTTGCTATGTCCAATTCCCTCAACGTTTCGATGGTATTGATAGACATGATCGA 



PcsA3-3' AGAAAGGTTTGCTATGTCCAATTCCCTCAACGTTTCGATGGTATTGATAGACATGATCGA 

440 450 460 470 480 490 

490 500 510 520 530 540 

PcsA3-682 TATGCCAATCGGAACACAGTTTTCTTTGATATTAACATGAAAGGTCTAGATGGTATACAA 



PcsA3-3' TATGCCAATCGGAACACAGTTTTCTTTGATATTAACATGAAAGGTCTAGATGGTATACAA 

500 510 520 530 540 550

550 560 570 580 590 600

PcsA3-682 GGCCCTGTATATGTCGGCACGGGGTGTGTTTTCAGAAGGCAAGCTCTTTATGGTTATGAA
(SEQ ID NO: 5)

PcsA3-3' GGCCCTGTATATGTCGGCACGGGGTGTGTTTTCAGAAGGCAAGCTCTTTATGGTTATGAA
(SEQ ID NO: 9) 560 570 580 590 600 610

610 620 630 640 650 660

PcsA3-682 CCTCCAAAGGGACCTAAGCGCCCGAAAATGGTAACCTGTGGTTGCTGCCCTTGTTTTGGA



PcsA3-3' CCTCCAAAGGGACCTAAGCGCCCGAAAATGGTAACCTGTGGTTGCTGCCCTTGCTTTGGA

620 630 640 650 660 670 

670 680 690 700 710 720 

PcsA3-682 CGCCGCAGAAAGGACAAAAAGGAGTCTAAGGATGGTGGAAATGGAAATGGTGTAAGGGTA



PcsA3-3' CGCCGCAGAAAGGACAAAAAGCACTCTAAGGATGGTGGAAATGCAAATGGTCTAAGCCTA

680 690 700 710 720 730

730 740 750 760 770 780 

PcaA3-682 GMGCAG<XAMGATGACAAGGAGTTATT(^TGT<^^ 



PcsA3-3' GAAGCAGCCGAAGATGACAAGGAGTTATTGATGTCCCACATGAACTTTGAAAAGAAATTT

740 750 760 770 780 790 

790 800 810 820 830 840 

PcsA3-682 GGACAATCAGCCATTTTTGTAACTTCAACACTGATGGAACAAGGTGGTGTCCCTCCTTCT 



PcsA3-3' GGACAATCAGCCATTTTTGTAACTTCAACACTGATGGAACAAGGTGGTGTCCCTCCTTCT

800 810 820 830 840 850

850 860 870 880 890 900

PcsA3-682 TCAAGCCCCGCAGCTTTGCTCAAAGAAGCCATTCATGTAATTAGTTGTGGTTATGAAGAC



PcsA3-3' TCAAGCCCTGCAGCTTTGCTCAAAGAAGCCATTCATGTAATTAGTTGTGGTTATGAAGAC

860 870 880 890 900 910

910 920 930 940 950 960 

PcsA3-682 AAAACAGAATGGGGAAGCGAGCTTGGCTGGATTTACGGCTCGATTACAGAAGATATCTTA 



PcsA3-3' AAAACCGAATGGGGAAGCGAGCTTGGCTGGATTTACGGCTCGATTACAGAAGATATCTTA 

920 930 940 950 960 970 

970 980 
PcsA3-682 ACAGGATTCAAGATGCATTGCCGTGGAT 



PcsA3-3' ACAGGTTTCAAGATGCATTGCCGTGGAT

980 990 1000



FIG. 4 
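
The comparison shown in FIG. 4 is an ungapped, position-by-position alignment, so the identity of any row can be recomputed directly. The following Python sketch does this for the row labelled 910 to 960 of PcsA3-682, using the two sequences as transcribed in the figure; the function name `compare` and the variable names are illustrative assumptions and are not taken from the specification.

```python
def compare(a: str, b: str) -> tuple[int, list[int]]:
    """Return the number of identical positions and the 1-based mismatch positions."""
    if len(a) != len(b):
        raise ValueError("an ungapped comparison needs sequences of equal length")
    mismatches = [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return len(a) - len(mismatches), mismatches


# Row of FIG. 4 labelled 910-960 (PcsA3-682 numbering), as transcribed there.
ROW_682 = "AAAACAGAATGGGGAAGCGAGCTTGGCTGGATTTACGGCTCGATTACAGAAGATATCTTA"
ROW_3P = "AAAACCGAATGGGGAAGCGAGCTTGGCTGGATTTACGGCTCGATTACAGAAGATATCTTA"

identical, mismatches = compare(ROW_682, ROW_3P)
print(f"{identical}/{len(ROW_682)} identical, mismatches at {mismatches}")
# 59/60 identical, mismatches at [6]
```

In this row the single difference falls in the third position of a codon (ACA in PcsA3-682, ACC in PcsA3-3'), and both codons encode Thr under the standard genetic code.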
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1. Claims: 1-4 (partially) 

Claims 1-4 (partially) refer to a DNA coding for a protein
having a cellulose synthase activity and comprising an amino
acid sequence shown in SEQ ID NO: 2 or an amino acid sequence
involving deletion, substitution, insertion, or addition of
one or several amino acids relevant to SEQ ID NO: 2, a
recombinant vector comprising all or part of said DNA, a
cell being transformed with said DNA, and a method for
controlling cellulose synthesis in a cell by the use of said
DNA.

2. Claims: 1-4 (partially)

Claims 1-4 (partially) refer to a DNA coding for a protein
having a cellulose synthase activity and comprising an amino
acid sequence shown in SEQ ID NO: 4 or an amino acid sequence
involving deletion, substitution, insertion, or addition of
one or several amino acids relevant to SEQ ID NO: 4, a
recombinant vector comprising all or part of said DNA, a
cell being transformed with said DNA, and a method for
controlling cellulose synthesis in a cell by the use of said
DNA.

3. Claims: 1-4 (partially)

Claims 1-4 (partially) refer to a DNA coding for a protein
having a cellulose synthase activity and comprising an amino
acid sequence shown in SEQ ID NO: 8 and in SEQ ID NO: 11 or an
amino acid sequence involving deletion, substitution,
insertion, or addition of one or several amino acids
relevant to SEQ ID NO: 8 and/or SEQ ID NO: 11, a recombinant
vector comprising all or part of said DNA, a cell being
transformed with said DNA, and a method for controlling
cellulose synthesis in a cell by the use of said DNA.